Yet another method to grab download-disabled slideshows from SlideShare | Lewis' Blog
Yes, I know. Horrible, horrible subject. The thought of stealing jpgs which are
publicly viewable... Oh, well.
Standard disclaimer applies: Teaching someone how to steal a book does not
make the teacher guilty of theft. If you get in trouble for following these
directions, shame on you, not on me.
So, as a proof of concept, I was curious as to what SlideShare does to inhibit
downloading of presentations. Apparently, all they do is not provide the (original?)
PowerPoint document for download.[1][2] However, if one examines the source of
the page, it is fairly easy to determine the filename of each slide image, and then
automate a fetch to grab each one.
Web browser or something to retrieve the source of one of the slideshow's pages (well, since you're reading this, I
suppose we have this one covered)
cURL (look for a version compatible with your OS; start here)
That's it.
1. Open the page containing any slide in the set you want to download.
2. View the source of the page (in Mozilla-based browsers, this is usually accomplished with Ctrl-U).
3. Search for "og:image" in the source, and copy the url which follows.
4. Note the slide count in the lower left of the presentation.
5. Open a terminal (command prompt or window session).
6. Navigate to where you would like to save the downloaded images.
7. Run the following cURL command:
curl -O http://image.slidesharecdn.com/<name of presentation including numeric string>phpapp02
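Steps 2 and 3 can also be done without leaving the terminal. What follows is only a rough sketch: extract_og_image is a made-up helper name, and the sed pattern assumes that the og:image meta tag sits on a single line with property appearing before content, as in the page source quoted in this post. If SlideShare changes its markup, the pattern will need adjusting.

```shell
#!/bin/sh
# Hypothetical helper (not part of cURL): read HTML on stdin and print the
# content="..." value of the first og:image meta tag.
# Assumption: the tag is on one line and property comes before content.
extract_og_image() {
    sed -n 's/.*property="og:image"[^>]*content="\([^"]*\)".*/\1/p' | head -n 1
}

# Example use (fetches the page silently and pipes it through the helper):
#   curl -s "http://www.slideshare.net/somedirectory/somepresentation" | extract_og_image
```

The slide count (step 4) still has to be read off the page by eye.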
Searching for og:image in the source, we find:
<!-- fb open graph meta tags -->
<meta name="fb_app_id" property="fb:app_id" class="fb_og_meta" content="7890123456" />
<meta name="og_type" property="og:type" class="fb_og_meta" content="slideshare:presentation" />
<meta name="og_url" property="og:url" class="fb_og_meta" content="http://www.slideshare.net/somedirectory/somepresentation
<meta name="og_image" property="og:image" class="fb_og_meta" content="http://image.slidesharecdn.com/somepresentation12345
The url specified by og_image is:
http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-1-1024.jpg
Assume that the slide count is 55 (i.e., on the first slide, the lower left indicates
"1/55"). Once in the directory where I want to save the images, I simply tell
cURL:
curl -O http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-[1-55]-1024.jpg
and cURL will retrieve each jpg in the deck.
The -O option tells cURL to save the data as the original filename. Without this,
cURL will dutifully retrieve a data stream and dump it to standard output, which
is of little use.
The [1-55] tells cURL to successively download the filename, replacing that space
(between the dashes in this example) with the subsequent number, e.g.:
curl -O http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-1-1024.jpg
curl -O http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-2-1024.jpg
curl -O http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-3-1024.jpg
[...]
curl -O http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-55-1024.jpg
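To make the expansion concrete, here is a small sketch of roughly what the numeric range amounts to, written as a plain shell loop. It is an illustration only and not how cURL is implemented internally; expand_slide_urls is a hypothetical name, and the prefix and suffix arguments are simply whatever surrounds the number in the url.

```shell
#!/bin/sh
# Hypothetical helper: print one url per number from first to last,
# mimicking the list of urls that a [first-last] range stands for.
expand_slide_urls() {
    prefix=$1
    first=$2
    last=$3
    suffix=$4
    i=$first
    while [ "$i" -le "$last" ]; do
        printf '%s%s%s\n' "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}

# Example use, fetching each generated url with its original filename:
#   expand_slide_urls "http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95/slide-" 1 55 "-1024.jpg" |
#   while read -r url; do curl -O "$url"; done
```

The built-in range does the same job in one command, which is the whole appeal of the one-liner.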
My natural inclination was to use wget for this. However, wget does not support
globbing for http (no wildcards), and while I could have fed it some regex to
specify one url after the other, this is a horribly clumsy way of accomplishing the
task.
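For comparison, the wget route might look something like the sketch below. This is an assumption about one way to do it, not something tested against SlideShare: make_url_list is a made-up helper, the slide-N-1024.jpg pattern is taken from the example in this post, and seq is assumed to be available (it is on most Linux and BSD systems, though it is not part of POSIX).

```shell
#!/bin/sh
# Hypothetical helper: print the urls for slides 1..count under a base url.
# Assumes the slide-N-1024.jpg naming shown in this post.
make_url_list() {
    base=$1
    count=$2
    seq 1 "$count" | while read -r n; do
        printf '%s/slide-%s-1024.jpg\n' "$base" "$n"
    done
}

# Example use: wget reads the list of urls from standard input with "-i -".
#   make_url_list "http://image.slidesharecdn.com/somepresentation1234567890phpapp02/95" 55 | wget -i -
```

It works on paper, but it takes a helper and a pipe to do what the cURL range does inline.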
The point of all of this is not to go and rip off every download-disabled
presentation on SlideShare, but rather to present a working example of how to
use cURL to retrieve sequential filenames via http (or ftp). If you find another
good use for this one-liner, please post a comment to let me know.
1. Point of fact #1: I don't use PowerPoint, and I absolutely go ballistic when someone emails me one of those
disgustingly huge files which I must then convert to something readable (i.e., pray that it will open in Impress and
then allow me to save it to an Impress file or, better, a pdf).
2. Point of fact #2: I do not (yet) have an account on SlideShare, which is apparently required to download any
presentations from their site.
http://www.2rosenthals.net/wordpress/yetanothermethodtograbdownloaddisabledslideshowsfromslideshare727/
15/05/2016