
15/05/2016

Yet another method to grab download-disabled slideshows from SlideShare | Lewis' Blog

Yes, I know. Horrible, horrible subject. The thought of stealing jpgs which are
publicly viewable... Oh, well.
Standard disclaimer applies: Teaching someone how to steal a book does not
make the teacher guilty of theft. If you get in trouble for following these
directions, shame on you, not on me.
So, as a proof of concept, I was curious as to what SlideShare does to inhibit
downloading of presentations. Apparently, all they do is not provide the (original?)
PowerPoint document for download [1, 2]. However, if one examines the source of
the page, it is fairly easy to determine the filename of each slide image, and then
automate a fetch to grab each one.

Web browser or something to retrieve the source of one of the slideshow's pages (well, since you're reading this, I
suppose we have this one covered)
cURL (look for a version compatible with your OS)

That's it.

1. Open the page containing any slide in the set you want to download.
2. View the source of the page (in Mozilla-based browsers, this is usually accomplished with Ctrl-U).
3. Search for "og:image" in the source, and copy the url which follows.
4. Note the slide count in the lower left of the presentation.
5. Open a terminal (command prompt or window session).
6. Navigate to where you would like to save the downloaded images.
7. Run the following cURL command:

curl -O http://image.slidesharecdn.com/<name-of-presentation-including-numeric-string>-phpapp02[...]

Searching for og:image in the source, we find:

<!-- fb open graph meta tags -->

<meta name="fb_app_id" property="fb:app_id" class="fb_og_meta" content="7890123456" />
<meta name="og_type" property="og:type" class="fb_og_meta" content="slideshare:presentation" />
<meta name="og_url" property="og:url" class="fb_og_meta" content="http://www.slideshare.net/somedirectory/somepresentation
<meta name="og_image" property="og:image" class="fb_og_meta" content="http://image.slidesharecdn.com/somepresentation-12345

The url specified by og_image is:
http://image.slidesharecdn.com/somepresentation-1234567890-phpapp02/95/slide-1-1024.jpg
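
Steps 2 and 3 can also be done from the terminal instead of by eye. The following is only a sketch: the function name og_image_url and the file name page.html are made up for illustration, and it assumes the og:image meta tag sits on a single line with property appearing before content, as in the excerpt shown here.

```shell
# Hypothetical helper (not part of cURL): print the og:image URL found in
# HTML read from standard input.
# Assumption: the meta tag is on one line, with property="og:image"
# appearing before content="...".
og_image_url() {
    sed -n 's/.*property="og:image"[^>]*content="\([^"]*\)".*/\1/p' | head -n 1
}

# Usage, with the page source saved as page.html (an assumed file name):
#   og_image_url < page.html
```

If the page lays out its meta tags differently, the sed expression will need adjusting; copying the url by hand as described in step 3 always works.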

Assume that the slide count is 55 (i.e., on the first slide, the lower left indicates
"1/55"). Once in the directory where I want to save the images, I simply tell
cURL:
curl -O http://image.slidesharecdn.com/somepresentation-1234567890-phpapp02/95/slide-[1-55]-1024.jpg

and cURL will retrieve each jpg in the deck.

The -O option tells cURL to save the data as the original filename. Without this,
cURL will dutifully retrieve a data stream, which is of little use.
The [1-55] tells cURL to successively download the filename, replacing that space
(between the dashes in this example) with the subsequent number, e.g.:
curl -O http://image.slidesharecdn.com/somepresentation-1234567890-phpapp02/95/slide-1-1024.jpg
curl -O http://image.slidesharecdn.com/somepresentation-1234567890-phpapp02/95/slide-2-1024.jpg
curl -O http://image.slidesharecdn.com/somepresentation-1234567890-phpapp02/95/slide-3-1024.jpg
[...]
curl -O http://image.slidesharecdn.com/somepresentation-1234567890-phpapp02/95/slide-55-1024.jpg
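
The range pattern can also be built from the og:image url and the slide count rather than typed by hand. This is a rough sketch only: slide_range_url is a made-up helper name, and it assumes the url ends in /slide-1-<width>.jpg as in the example in this post.

```shell
# Hypothetical helper: turn the first slide's URL into a cURL range pattern.
#   $1 = URL of the first slide (assumed to end in /slide-1-<width>.jpg)
#   $2 = slide count
slide_range_url() {
    printf '%s\n' "$1" |
        sed "s|/slide-1-\([0-9]*\)\.jpg\$|/slide-[1-$2]-\1.jpg|"
}

# Usage (not run here, since it fetches from the network):
#   curl -O "$(slide_range_url "$first_slide_url" 55)"
```

Quoting the pattern keeps the shell from trying to interpret the square brackets itself, so they reach cURL untouched.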

My natural inclination was to use wget for this. However, wget does not support
globbing for http (no wildcards), and while I could have fed it some regex to
specify one url after the other, this is a horribly clumsy way of accomplishing the
task.
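
For comparison, here is roughly what a wget route could look like using a plain shell loop rather than a regex. It is a sketch under assumptions: slide_urls is a made-up helper name, the base url is taken to be everything up to and including /95, and the 1024 width is copied from the example in this post.

```shell
# Hypothetical helper: print one slide URL per line.
#   $1 = base URL up to and including "/95" (assumed layout)
#   $2 = slide count
slide_urls() {
    i=1
    while [ "$i" -le "$2" ]; do
        printf '%s/slide-%d-1024.jpg\n' "$1" "$i"
        i=$((i + 1))
    done
}

# Usage (not run here, since it fetches from the network):
#   slide_urls "$base_url" 55 | wget -i -
```

wget reads its url list from standard input when given -i -, so the loop does the work that cURL's bracket range does in a single argument.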

The point of all of this is not to go and rip off every download-disabled
presentation on SlideShare, but rather to present a working example of how to
use cURL to retrieve sequential filenames via http (or ftp). If you find another
good use for this one-liner, please post a comment to let me know.
1. Point of fact #1: I don't use PowerPoint, and I absolutely go ballistic when someone emails me one of those
disgustingly huge files which I must then convert to something readable (i.e., pray that it will open in Impress and
then allow me to save it to an Impress file or better, a pdf).
2. Point of fact #2: I do not (yet) have an account on SlideShare, which is apparently required to download any
presentations from their site.


http://www.2rosenthals.net/wordpress/yet-another-method-to-grab-download-disabled-slideshows-from-slideshare-727/
