It's a scrape

You may not know this, but not every song you can listen to at The Sixtyone was uploaded by the artist or an artist's agent. Some of the songs that arrive at the Browse Recently Uploaded pages are not uploaded at all, but downloaded by T61's own robot: a script that the developers run daily to check some selected music blogs for new mp3s. Any new audio file that meets certain criteria is then downloaded and added to the music pool.
The whole process of checking and downloading content from other websites is called scraping. So any song that T61 fetches from other sites is called an mp3 scrape. Scrapes used to be identifiable on T61 by an extract of the original blog post in the song's comment area, but not anymore. If you don't care about the game or interacting with artists, you probably don't care where the music at T61 comes from. Otherwise, keep reading to learn how you can tell the pushed from the pulled.
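
How T61's robot works internally isn't public, so the following is only a rough sketch of the general idea of scraping a blog page for mp3s. It assumes the blog links to its audio files with ordinary `<a href="...">` tags, and the names `Mp3LinkParser` and `find_new_mp3s` are made up for this illustration. It works on HTML that has already been fetched and simply lists the mp3 links that haven't been seen before.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class Mp3LinkParser(HTMLParser):
    """Collects the targets of <a href="..."> links that point at .mp3 files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string and letter case when checking the extension.
        if href and href.lower().split("?")[0].endswith(".mp3"):
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def find_new_mp3s(page_html, base_url, already_seen):
    """Return mp3 URLs found in page_html that are not in already_seen."""
    parser = Mp3LinkParser(base_url)
    parser.feed(page_html)
    return [link for link in parser.links if link not in already_seen]
```

A real robot would also have to fetch the pages, apply whatever selection criteria the developers chose, and remember what it downloaded the day before. None of that is shown here.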

At this time there are seven music blogs that T61's scraping robot checks for new songs. These are Aquarium Drunkard, Welikeitindie, Soul Sides, Stereogum, Sucka Pants, Gorilla Vs. Bear and Palms Out Sounds. With some exceptions, the pulled songs appear daily between 3:00 pm and 4:00 pm GMT (10am - 11am EST / 7am - 8am PST). If there's no artist account, one will be created on the fly. In that case the robot also checks Last.fm for an account and scrapes bio and pictures from it. For an example, take a look at the T61 and Last.fm profiles of Jeff Mangum.

The robot can't make human decisions. If it could, it would be able to avoid little mistakes like creating two accounts for the same artist. This happens when several blogs share music from the same artist. The slightest difference in the spelling of the artist's name causes the robot to create extra profile pages. Even minuscule typing errors can cause this behavior. The last time we saw this happen was on January 4th, when the robot scraped three songs from Soul Sides. All three songs were performed by the recently deceased Jamaican artist Byron Lee, but two profiles were created, only because in two songs Dragonaires was misspelled as Dragonnaires.
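
Duplicates like this are the kind of mistake a fuzzy name comparison could catch. The sketch below is not how T61 does it (the robot evidently compares names exactly); it only shows one possible approach using Python's standard `difflib` module. The function names and the 0.9 threshold are assumptions picked for this example.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase the name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def find_existing_artist(new_name, known_names, threshold=0.9):
    """Return the known artist name most similar to new_name,
    or None if nothing reaches the similarity threshold."""
    best_name, best_score = None, 0.0
    for known in known_names:
        score = SequenceMatcher(None, normalize(new_name), normalize(known)).ratio()
        if score > best_score:
            best_name, best_score = known, score
    return best_name if best_score >= threshold else None


# Illustrative data only: one existing profile and one misspelled newcomer.
known = ["Byron Lee & The Dragonaires", "Jeff Mangum"]
match = find_existing_artist("Byron Lee & The Dragonnaires", known)
```

Here `match` ends up pointing at the existing profile instead of a new one being created. The threshold is a trade-off: set it too low and two genuinely different artists with similar names get merged, which is arguably worse than a duplicate profile.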

Some keen listeners who care about their Listener status, who like to compete with each other and/or just want to tag a song with their screen name, always make sure they know in advance which scrapes will get pulled. Even T61 co-founder James Miao takes advantage of this possibility. Yours truly likes these aspects of T61 very much, so he's not the one to complain about advance knowledge, but competition becomes more fun when there are more competitors, don't you think? Worse is that a lot of listeners give feedback on scraped music or artists, while these artists possibly won't talk back at all. They don't know they're on T61.



As a matter of fact, they possibly don't know at all how widely their music was semi-legally spread around the blogosphere. The whole mp3 blogging phenomenon seems to be based on the assumption that when an artist allows one blog to share some music, he unknowingly allows every blog to copy and share it as well. I guess all is fine as long as the artist remains unsigned, but what if the indie becomes a major? Isn't it more than likely that their record company will demand that T61 withdraw the music from the pool?

So there you have three reasons why mp3 scrapes should be clearly flagged as scrapes. They sometimes cause annoying errors like duplicate profiles, some listeners wrongly think they can interact with the scraped artists, and it's not certain at all whether mp3 scraping is a legitimate action. It would be best if the developers of T61 reinstated some identification, but as long as they don't, here's what you can do to identify the scrapes yourself.

Method 1: Create an account at The Hype Machine. This site scrapes music from hundreds of blogs, including the seven mentioned before. Once your account is active, click on the link Blog List in the footer of the hypem.com page, locate the blogs you want to follow and favorite them by clicking on the grey heart. After that, go to your profile dashboard, click Watchlist Songs and choose Via Blogs in the sub menu. Doing this opens a page that lists every song recently published by the favorited blogs. If you are used to following syndicated feeds in your browser or in a special feed reader like the free Feedreader, you can also create a feed of this page.

Method 2: If you don't want to go through the hassle of creating yet another online account, just check my Watchlist Songs Via Blogs page at Hypem. You can even syndicate that page if you please.

We should be aware that the T61 developers can change their scraping policy whenever they want. So if you're still in doubt whether a certain song was scraped or uploaded, use either The Hype Machine or Elbows to do a song search. If the song was published by any music blog, these sites can tell you when and where.

For those of you who like to have some more information on the subject of audio scrapes, here are some interesting reads:

Evonity, 2009

A special thank you and big hug for my friend SallySilvera, who at my request kept track of almost every scraped mp3 during the last six weeks of 2008, to help me put this information together.

Story also available at Evonity.org

7 comments:

Michele Yamazaki said...

Great article, Ben.

sally said...

Thanks Ben! It's been an interesting adventure of detective work and discovery. I hope everyone takes the time to leave their comments in the forum!

Babble said...

Nicely done! I think the next research project is to figure out why certain scrapes are being selectively deleted.

ImOnlySleeping said...

I would suspect some scrapes disappear (say, Depeche Mode) when the takedown notice shows up. Having their old hits circulating around the internet isn't helping them any. An interesting thing to follow up on is whether the disappearing songs have also been removed from the blogs that they came from.

Evonity said...

I expressed my concerns about human intervention in the forum. A request from the copyright owner to remove songs seems the most valid explanation. I researched only a few occasions in which the songs were not removed from the originating blog.

Babble said...

I have a record of 12 removals, for 10 of which I found the originating blog and mp3 still intact. All of these removals happened within the last month, coinciding with some recent activity and certain fb'ers. Additionally, each time a song is removed or "borked," it happens to be the last song played by a certain site owner.

Evonity said...

For the fourth day in a row, there was no scraped music published at T61. Coincidence? I think so. Although I'm vain enough to think it's because of the story, it's more likely because of the number of songs that were deleted (see Babble's report). I assume T61 got too many requests from the copyright owners to take away their songs.