Easily Download the Entire Cylinder Digitization Project

Discussions on Talking Machines & Accessories
MordEth
Victor IV
Posts: 1157
Joined: Wed Jan 07, 2009 1:01 pm
Personal Text: Contact me for TMF tech support.
Location: Boston, MA

Easily Download the Entire Cylinder Digitization Project

Post by MordEth »

By now, I figure that most of our members (or at least the ones that enjoy cylinder recordings) are aware of the Cylinder Digitization and Preservation Project. If not, definitely go check it out. Even if you do not want to get audio files from them, they can be a great resource for looking up information on your cylinders.

However, you might want to download the entire library of MP3s that they provide. If you own a Mac (running Mac OS X; sorry, Henry) or a Linux/Unix machine, it is very easy to download them all at once.

Regardless of which of those operating systems you have, you would do:

Code:

cd "$HOME/Desktop"
mkdir cylinders
cd cylinders

This changes your working directory to your desktop, makes a ‘cylinders’ directory, and then changes into that directory.

Then, if you have a Mac, you would do:

Code:

curl http://talkingmachine.info/scripts/cylinders.sh -o cylinders.sh

On Linux or Unix, which is more likely to have ‘wget’ instead of ‘curl’, you would do:

Code:

wget http://talkingmachine.info/scripts/cylinders.sh

Either command downloads the script (displayed inline below) into that directory.
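If you are not sure which of the two tools your system has, the choice can be made automatically. The following is only a minimal sketch and not part of the script itself; the function name `pick_downloader` is made up for illustration.

```shell
#!/bin/bash

#  Hypothetical helper: print the name of the first download tool found,
#  or return a non-zero status if neither curl nor wget is installed.
pick_downloader() {
	if command -v curl > /dev/null 2>&1; then
		echo "curl"
	elif command -v wget > /dev/null 2>&1; then
		echo "wget"
	else
		return 1
	fi
}

#  Usage (commented out so that pasting this does not download anything):
#
#  case "$(pick_downloader)" in
#  	curl) curl http://talkingmachine.info/scripts/cylinders.sh -o cylinders.sh ;;
#  	wget) wget http://talkingmachine.info/scripts/cylinders.sh ;;
#  	*)    echo "Neither curl nor wget was found." ;;
#  esac
```

`command -v` is used here instead of `which` because it is built into the shell, so it behaves the same way on Mac OS X and Linux.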

Then the directions converge again and you would do:

Code:

chmod u+x ./cylinders.sh
./cylinders.sh

This makes the script executable and then runs it.

It will then go through and download several thousand MP3s, so it will take a while, and is not recommended for users still using dial-up.
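Because the run takes so long, it may get interrupted partway through. One possible way to avoid starting over is to skip files that are already on disk. This is only a sketch under that assumption; the helper name `already_have` is hypothetical and does not appear in the script.

```shell
#!/bin/bash

#  Hypothetical helper: succeed if the named file exists and is not empty.
already_have() {
	[ -s "$1" ]
}

#  Inside the script's download loop, after FILE has been set, this could
#  be used as:
#
#  if already_have "$FILE"; then
#  	NUM=`expr $NUM + 1`
#  	continue
#  fi
```

Note that this check cannot tell a complete MP3 from a partial one, so a file that was cut off mid-download would be skipped as well.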

If there is any interest, I could probably come up with something for users on Windows, using Firefox and the DownThemAll! extension. Windows just does not offer the tools needed to do something like this without a good bit of help. ;)

However, I’m definitely willing to do the work if it will benefit someone who wants it.

Here are the contents of the script:

Code:

#!/bin/bash

#  MordEth's lazy script to grab the UCSB Davidson Library's cylinder mp3s.
#  Last updated on 2008.12.07.

#  Edit max based on the current cylinder count on UCSB's page:

MAX="8000"

#  Don't change below this point unless you know what you're doing.

NUM="1"
BASE="0000"

#  Determine whether we have curl or wget to grab the files:

#  command -v is a shell built-in, so this check does not depend on 'which':

if command -v curl > /dev/null 2>&1
	then
		GET="curl"

elif command -v wget > /dev/null 2>&1
	then
		GET="wget"

else
	echo "You must have wget or curl installed to use this script."
	exit 1
fi

while [ $NUM -le $MAX ]
	do
		if [ $NUM -le 9 ]
			then
				PAD="000${NUM}"

		elif [ $NUM -le 99 ]
			then
				PAD="00${NUM}"

		elif [ $NUM -le 999 ]
			then
				PAD="0${NUM}"

		else
			PAD="$NUM"
			BASE="`echo $NUM | cut -b1`000"
		fi

		FILE="cusb-cyl${PAD}d.mp3"
		URL="http://cylinders.library.ucsb.edu/mp3s/${BASE}/${PAD}/${FILE}"

		#  Quote the variables so the shell passes each one as a single argument:

		if [ "$GET" = "curl" ]
			then
				echo ""
				echo "Currently downloading:  $FILE"
				curl "$URL" -o "$FILE"

		elif [ "$GET" = "wget" ]
			then
				wget "$URL"
		fi

		NUM=$((NUM + 1))
	done

#  Delete any "cylinders" that are actually HTML error messages:

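#  -l prints only the names of matching files, and -I makes grep treat
#  binary files as non-matching, so a real MP3 that happens to contain
#  the bytes "html" is not deleted by mistake:

grep -lI "html" *.mp3 | xargs rm -f

The University of California, Santa Barbara is doing a wonderful service in providing this.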
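To make the URL scheme that the script relies on easier to see, here is a small sketch that builds the address for a single cylinder number. It assumes the same directory layout as the script (a four-digit padded number, filed under its thousands digit followed by 000) and only handles numbers from 1 to 9999; the function name `cylinder_url` is made up for illustration.

```shell
#!/bin/bash

#  Hypothetical helper: build the download URL for one cylinder number,
#  following the same directory layout the script above assumes.
cylinder_url() {
	local num="$1"
	local pad base

	#  Zero-pad to four digits, e.g. 7 -> 0007.
	pad=$(printf '%04d' "$num")

	#  First digit of the padded number plus 000, e.g. 1234 -> 1000.
	base="${pad:0:1}000"

	echo "http://cylinders.library.ucsb.edu/mp3s/${base}/${pad}/cusb-cyl${pad}d.mp3"
}

cylinder_url 7
cylinder_url 1234
```

The `printf '%04d'` call does the same job as the chain of `if`/`elif` tests in the script, just in one step.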


Let me know if you have any questions or if I can be of assistance in using this (or for anything else).

— MordEth

Proudly supporting phonograph discussion boards, hosting phonograph sites and creating phonograph videos since 2007.
Need web hosting or web (or other graphic) design? Support MordEth by using BaseZen Consulting for all of your IT consulting needs.
Want more phonograph discussion? Be sure to visit The Online Edison Phonograph Discussion Board.

JohnM
Victor VI
Posts: 3141
Joined: Fri Jan 09, 2009 2:47 am
Location: Jerome, Arizona

Re: Easily Download the Entire Cylinder Digitization Project

Post by JohnM »

David,
How cool is that?! Thanks for making this possible/palatable/understandable/less intimidating for the members of this board. I can't tell you how great it is to have someone to help us integrate modern technology with our interest in old technology. Thanks so much (again)! John M
"All of us have a place in history. Mine is clouds." Richard Brautigan


Re: Easily Download the Entire Cylinder Digitization Project

Post by MordEth »

JohnM wrote:
David,
How cool is that?! Thanks for making this possible/palatable/understandable/less intimidating for the members of this board. I can't tell you how great it is to have someone to help us integrate modern technology with our interest in old technology. Thanks so much (again)

John,

You’re very welcome. I originally wrote that quite a while ago, back when the project was featured on Slashdot, of all places (an older, and still very busy, tech news site). While I was into cylinders at that time, I knew that John would love to have the MP3s, and he was still using dial-up at that point.

So I wrote it to download them all while I slept or did other things. John’s a great friend, but I am not going to click through a few thousand pages for him. ;)

Since then I cleaned it up to check how your system is able to grab files, then loop through and grab them. When they have more than 8,000 songs available, all I have to do is change this line:

Code:

MAX="8000"

to a more appropriate number.
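If editing the file each time is a nuisance, the count could instead be passed on the command line. This is just a sketch of one way to do it, not how the script currently works; the helper name `max_from_arg` is hypothetical.

```shell
#!/bin/bash

#  Hypothetical helper: use the first argument as the cylinder count if it
#  is a whole number, otherwise fall back to the script's default of 8000.
max_from_arg() {
	case "$1" in
		''|*[!0-9]*) echo "8000" ;;
		*)           echo "$1" ;;
	esac
}

#  In the script, the MAX line could then become:
#
#  MAX="$(max_from_arg "$1")"
```

With that change, running `./cylinders.sh 9500` would try cylinders 1 through 9500, while running it with no argument would keep the current behavior.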

I believe that I saw the Phonautograph there, as well, although it may have been on Wired instead (or on both sites). Sometimes it’s strange where you might find things...

Hopefully my tutorials in the Board Tech section have proved equally helpful, although I owe you guys a lot more of them. ;)

Your friendly internet daemon,

MordEth
