Tag Archives: Carl Malamud

Aaron’s PACER Project Explained

Here’s a clip from the film “The Internet’s Own Boy” (directed by Brian Knappenberger), which explains the PACER project in more detail. [This is background for our next Raw Thought Salon on March 8th.]

Clip on the Internet Archive

Clip on YouTube

PACER is the name of the website that lawyers use to retrieve legal documents from current and past court cases. These documents make up the precedents that make up “the law,” yet to access documents on PACER you must have a credit card and pay per page (a dime or more for *each* page, so you can see how it adds up quickly).

You can understand why this “pay to see the law” system could present a problem for anyone who doesn’t have a credit card or is unfamiliar with the details of legal proceedings.

Aaron learned of a program which enabled free access to PACER via a small group of libraries across the country, and coordinated with a friend to download millions of PACER documents.

The FBI didn’t like it, and investigated him for a while, including surveillance at his parents’ home. But ultimately it had to let it go, because Aaron hadn’t actually done anything illegal.

Below is a transcription of the PACER section of “The Internet’s Own Boy” (directed by Brian Knappenberger).

Brewster Kahle – Founder, Internet Archive:

“How can you bring public access to the public domain? It may sound obvious that you would have public access to the public domain, but in fact, it’s not true. So, the public domain should be free to all, but it’s often locked up. There’s often guard cages. It’s like having a National Park but with a moat around it and gun turrets pointed out, in case somebody might want to come and actually enjoy the Public Domain.

One of the things Aaron was particularly interested in was bringing public access to the public domain. It was one of the things that got him into so much trouble.”

Stephen Schultze – Former Fellow, Berkman Center for Internet and Society at Harvard:

“I had been trying to get access to Federal Court records in the United States. What I discovered was a puzzling system, called PACER, which stands for ‘Public Access to Court Electronic Records.’

I started Googling and that’s when I ran across Carl Malamud.”

Narrator: “Access to legal materials in the United States is a 10 billion dollar per year business.”

Carl Malamud – Founder, Public.Resource.org:

“PACER is just this incredible abomination of government services. Ten cents a page. It’s this most brain dead code you’ve ever seen. You can’t search it. You can’t bookmark anything. You’ve gotta have a credit card. And these are “public records.”

U.S. District Courts are very important. That’s where a lot of our seminal litigation starts. Civil Rights cases. Patent cases. All sorts of stuff. And journalists and students and citizens and lawyers all need access to PACER and it fights ’em every step of the way.

People without means can’t see the law as readily as people with that American Express card. It’s a poll tax on access to justice.”

Tim O’Reilly – Publisher:

“The law is the operating system of our democracy, and you have to pay to see it? That’s not much of a democracy.”

Stephen Schultze: “They make about 120 million dollars a year on the PACER system and it doesn’t cost anything near that, according to their own records.

In fact, it’s illegal. The E-Government Act of 2002 states that the courts may charge ‘only to the extent necessary’ in order to reimburse the costs of running PACER.”

Narrator: “As the founder of Public.Resource.org, Malamud wanted to protest the PACER charges.

He started a program called “The PACER Recycling Project.” People could upload documents they had already paid for to a free database, so others could use them.”

Carl Malamud: “The PACER people were getting a lot of flak from Congress and others about public access. And so they put together this system in 17 libraries across the country where there was free PACER access. That’s one library every 22,000 square miles, I believe. So it wasn’t like really convenient.

I encouraged volunteers to join the ‘Thumb Drive Corps’ and download docs from the public access libraries and upload them to the PACER recycling site. People take a thumb drive into one of these libraries and they download a bunch of documents and then send ’em to me. And it was just a joke. In fact if you clicked on ‘Thumb Drive Corps,’ the Wizard of Oz, ya know, the munchkins singing, video clip came up.

But of course, I get this phone call from Steve Schultze and Aaron saying ‘Gee, we’d like to join the Thumb Drive Corps.’”

Stephen Schultze: “Around that time, I ran into Aaron at a conference. So I approached him and said ‘Hey, I’m thinking about doing an intervention on the PACER problem.’”

Narrator: “Schultze had already developed a program that could automatically download PACER documents from the trial libraries. Swartz wanted to take a look.”

Stephen Schultze: “So, I showed him the code. And I didn’t know what would come next, but as it turns out, over the next few hours at that conference, he was off sitting in a corner, improving my code, recruiting a friend of his that lived near one of these libraries to go into the library and to begin testing his improved code, and at some point the folks at the court realized something’s not going quite according to plan.”

Carl Malamud: “And data started to come in, and come in, and come in. Soon there were 760 GB of PACER docs. About 20 million pages.”

Narrator: “Using information retrieved from the trial libraries, Swartz was conducting massive automated parallel downloading of the PACER system. He was able to acquire more than 2.7 million federal court documents. Almost 20 million pages of text.”

Carl Malamud: “Now, I’ll grant you that 20 million pages perhaps exceeded the expectations of the people running the pilot access project, but surprising a bureaucrat isn’t illegal.”

Aaron & Carl decided to talk to the New York Times about what happened.

They also got the attention of the FBI, who began to stake out Swartz’ parents’ house in Illinois.

Carl Malamud: “I get a tweet from his mother saying ‘Call me!’ And I’m like what the hell’s going on here? So, I finally got a hold of Aaron, and Aaron’s mother is like ‘oh my god FBI, FBI, FBI’ ”

Noah Swartz: “An FBI agent drives down our home’s driveway trying to see if Aaron is like, in his room. And I remember being home that day and wondering why this car was driving down our driveway and just driving back out. That’s weird. Like five years later I read the FBI file and I’m like my goodness – that was the FBI agent, in my driveway.”

Carl Malamud: “He (Aaron) was terrified. He was totally terrified. He was way more terrified after the FBI actually called him up on the phone and tried to sucker him into coming down to a coffee shop without a lawyer. He said he went home and laid down on the bed, and was shaking.”

Narrator: The downloading also uncovered massive privacy violations in the court documents. Ultimately, the courts were forced to change their policies as a result.

And the FBI closed their investigation without bringing charges.

Cory Doctorow: “To this day, I find it remarkable that anybody, even at the most remote podunk field office of the FBI, thought that a fitting use for taxpayer dollars was investigating people for theft on the grounds that they had made the law public. How can you call yourself a ‘law man,’ and think there can possibly be anything wrong in this whole world with making the law public?”

NY Times Article On Aaron’s Pacer Project

This is a reference for our post “Aaron’s PACER Project Explained.”

This was published in the New York Times on February 12, 2009.

An Effort to Upgrade a Court Archive System to Free and Easy

By JOHN SCHWARTZ

FEB. 12, 2009

Aaron Swartz used a free trial of the government’s Pacer system to download 19,856,160 pages of documents in a campaign to place the information free online. Photo:  Michael Francis McElroy for The New York Times

Americans have grown accustomed to finding just about anything they want online fast, and free. But for those searching for federal court decisions, briefs and other legal papers, there is no Google.

Instead, there is Pacer, the government-run Public Access to Court Electronic Records system designed in the bygone days of screechy telephone modems. Cumbersome, arcane and not free, it is everything that Google is not.

Recently, however, a small group of dedicated open-government activists teamed up to push the court records system into the 21st century — by simply grabbing enormous chunks of the database and giving the documents away, to the great annoyance of the government.

“Pacer is just so awful,” said Carl Malamud, the leader of the effort and founder of a nonprofit group, Public.Resource.org. “The system is 15 to 20 years out of date.”

Worse, Mr. Malamud said, Pacer takes information that he believes should be free — government-produced documents are not covered by copyright — and charges 8 cents a page. Most of the private services that make searching easier, like Westlaw and Lexis-Nexis, charge far more, while relative newcomers like AltLaw.org, Fastcase.com and Justia.com, offer some records cheaply or even free. But even the seemingly cheap cost of Pacer adds up, when court records can run to thousands of pages. Fees get plowed back to the courts to finance technology, but the system runs a budget surplus of some $150 million, according to recent court reports.

To Mr. Malamud, putting the nation’s legal system behind a wall of cash and kludge separates the people from what he calls the “operating system for democracy.” So, using $600,000 in contributions in 2008, he bought a 50-year archive of papers from the federal appellate courts and placed them online. By this year, he was ready to take on the larger database of district courts.

Those courts, with the help of the Government Printing Office, had opened a free trial of Pacer at 17 libraries around the country. Mr. Malamud urged fellow activists to go to those libraries, download as many court documents as they could, and send them to him for republication on the Web, where Google could get to them.

Aaron Swartz, a 22-year-old Stanford dropout and entrepreneur who read Mr. Malamud’s appeal, managed to download an estimated 20 percent of the entire database: 19,856,160 pages of text.

Then on Sept. 29, all of the free servers stopped serving. The government, it turns out, was not pleased.

A notice went out from the Government Printing Office that the free Pacer pilot program was suspended, “pending an evaluation.” A couple of weeks later, a Government Printing Office official, Richard G. Davis, told librarians that “the security of the Pacer service was compromised. The F.B.I. is conducting an investigation.”

Lawyers for Mr. Malamud and Mr. Swartz told them that they appeared to have broken no laws, noting nonetheless that it was impossible to say what angry government officials might do.

At the administrative office of the courts, a spokeswoman, Karen Redmond, said she could not comment on the fate of the free trial of Pacer, or whether there had been a criminal investigation into the mass download.

The free program “is not terminated,” Ms. Redmond said. “We’ll just have to see what happens after the evaluation.” As for the system’s cost, she said: “We’re about as cheap as we can get it. We’re talking pennies a page.”

Carl Malamud has been leading the effort to push the court records system into the 21st century. Photo: Heidi Schumann for The New York Times

Meanwhile, the 50 years of appellate decisions remain online and Google-friendly, and the 20 million pages of lower court decisions are available in bulk form, but are not yet easily searchable. “I want the whole database in 2009,” Mr. Malamud said.

Mr. Malamud, 49, has a long record of trying to balance openness with privacy, and has also pushed the Securities and Exchange Commission and the Patent and Trademark Office to put their records online free. But the issue is a thorny one with court documents, which often contain personal information.

Daniel J. Solove, a professor at the George Washington University Law School, noted that marketers skim court records for personal data, and making records easier to troll will put even more data at risk. “It’s taking away this middle ground that offered a lot of protection, practically, and throwing it into this radically wide open box,” he said.

But this argument for what is known as “practical obscurity” does not convince Peter A. Winn, a privacy expert who is an assistant United States attorney in Washington State. Noting that he was speaking only for himself, he argued that the courts developed rules over the last 400 years to protect privacy.

“It worked in the bricks-and-mortar age — it should work in the electronic age,” Mr. Winn said. The administrative office of the courts, he said, should take on the role of policing privacy on its databases. “This is going to take focus and a lot of hard work,” he said.

Mr. Malamud agrees that the court system needs to do a better job of protecting privacy. He found thousands of documents in which the lawyers and courts had not properly redacted personal information like Social Security numbers, a violation of the courts’ own rules. There was data on children in Washington, names of Secret Service agents, members of pension funds and more.

“They’re pretty spectacular blunders,” he said. He sent letters to the clerks of individual courts around the country. After some initial inaction, and repeated and increasingly spirited notices from Mr. Malamud, most of the offending documents were pulled from the databases to be redacted.

Ms. Redmond, of the administrative office of the courts, said the courts comb through the documents “on a regular basis” and tell lawyers to redact confidential information. The number of violations, she noted, was relatively small.

Mr. Malamud scoffed at that. “This is a large number of transgressions, and this is illegal,” he said. “The law doesn’t say that you should only publish a small number of Social Security numbers!”

Mr. Malamud said his years of activism had led him to set a long-shot goal: serving in the Obama administration, perhaps even as head of the Government Printing Office. The thought might seem far-fetched — Mr. Malamud is, by admission, more of an at-the-barricades guy than a behind-the-desk guy. But he noted that he published more pages online last year than the printing office did.

Mr. Malamud represents a perspective of openness and transparency that is much in tune with the new administration’s, said Lawrence Lessig, a law professor at Harvard who is a leading advocate for free culture. “The principles are those that Carl has been at the center of defining,” he said.

The idea also seems to have a measure of appeal for John D. Podesta, a longtime fan of Mr. Malamud and head of the Obama transition team, who stopped short, however, of anything resembling an endorsement. “He would certainly shake things up,” Mr. Podesta said, laughing.

Mr. Malamud says he is not counting on the new administration’s being quite that bold. Besides, he said, he keeps himself awfully busy doing what he believes the government ought to be doing anyway.

“If called, I will certainly serve,” he said. “But if not called, I will probably serve anyway.”

2013 Post by Jason Leopold About Aaron’s PACER-related FOIA Requests

Jason Leopold, who is speaking Saturday at the San Francisco hackathon and again that night at the evening event, wrote about Aaron’s FOIA requests immediately following his death.

We will be going through the articles referenced in this excerpt below, one by one.

Aaron Swartz’s FOIA Requests Shed Light on His Struggle

From the Truthout article:

Swartz filed his first FOIA request in December 2010, more than two years after he landed on the government’s radar. He was seeking information about himself.

In 2008, Swartz’s friend and fellow open government activist Carl Malamud, the founder of the nonprofit public.resource.org, wanted to make federal court documents housed on the Public Access to Court Electronic Records system (PACER) available to the public for free. Using $600,000 he raised from supporters, Malamud purchased 50 years worth of appellate court documents and posted them on his website.

Then, the government started a pilot program in which access to federal court documents on PACER would be made available to users at no cost at 17 libraries around the country. Malamud urged activists like Swartz to visit the libraries, download the documents and send them over to him so he could make them available to the public via his website.

“So Aaron went to one of them and installed a small PERL script he had written that cycled sequentially through case numbers, requesting a new document from Pacer every three seconds, and uploading it to” Amazon’s Elastic Compute (EC2) Cloud server, Wired reported. “Aaron pulled nearly 20 million pages of public court documents, which are now available for free on the Internet Archive.”

The court documents Swartz legally accessed were worth $1.5 million. The government shut down the PACER pilot program and the FBI launched an investigation. Malamud has since published on his website emails he exchanged with Swartz about the incident.

On December 10, 2010, Swartz filed a FOIA request with the Justice Department’s Criminal Division seeking “documents related to me, Aaron Swartz, as well as any documents related to any associated PACER investigation.” The Justice Department responded by stating it could not locate any records. He also filed an identical FOIA request that day with the Executive Office of United States Attorneys. The office identified 72 documents that were withheld in full.