Being able to choose which information about yourself to share and which to not share is one of the things many people simultaneously value and take for granted. This desire for some level of privacy and control over information about oneself has become even more relevant as more technology exists to gather and store information about people. The Internet especially has changed how much information people present about themselves to others, some of which they know they are sharing and some of which they do not. Sometimes people agree willingly to share information about themselves as a normal part of Internet transactions and socializing. Quite often people are completely unaware that they are sharing information, what it is being used for, or the scope and volume of information that is being shared about them.
In the U.S., legal privacy concerns often deal with government search and seizure, such as what information requires a court order or warrant to gather and what information can be gathered by government agencies freely on any individual. This legal boundary has evolved over time as new technologies develop and social values around privacy (as opposed to other values such as security) have changed. However, privacy concerns extend well beyond government surveillance. Privately owned companies hold much (if not all) of the data that we transmit over the Internet, and their ability to use it responsibly and protect it from intrusion by attackers is critical. Even individuals can pose a risk to our private information if they are motivated enough to follow the trails we leave online and choose to share that information without our consent. Whether they are hackers gaining illicit access or simply people we give access to our less public social media posts, these individuals can turn our previously private data public with the click of a mouse.
While Internet technology presents many new ways for our personal information to be shared with or without our consent, it also offers new methods for privacy and anonymity. The very structure of much of the Internet is such that we are functionally anonymous to most people with whom we interact, since tracking someone over the Internet is not the default use of the technology and takes some amount of technical skill. Encryption provides methods of keeping private those conversations we want to protect even from the most technically inclined observers, allowing both truly private conversations and ways of confirming the identity of those we interact with. Anonymizing tools, such as Tor, allow users to hide their identity from would-be trackers, letting them interact with partners they may not trust to know who they truly are or where they are located in the physical world. These tools exist and can work as a counter to the privacy-diminishing nature of much of the rest of the Internet, but they often require knowledge and effort that some users may not be prepared for.
Privacy, at least from government intrusion, was a founding principle of the United States and is embodied in the Fourth Amendment to the U.S. Constitution. This amendment requires that residents be free from "unreasonable search and seizure" by law enforcement agents. In this context, a search is any intrusion into a space that a person might reasonably assume is private (such as a home, an office, or private documents). Likewise, a seizure is any situation where a government agent takes possession of a person's private property or physically detains the person. Often, decisions about whether a search or seizure should require a warrant (or court order) center around whether the person targeted in the search or seizure would have a "reasonable expectation of privacy." For instance, there is no reasonable expectation of privacy if a letter is left visible on a park bench. However, there would be a reasonable expectation of privacy if that same letter were inside a drawer in the person's home. Applying these principles in a world of advanced technology has been and likely will continue to be a challenge.
The primary considerations for the authors of the Constitution were physical possessions and physical bodies. As electronic communications developed and increased in importance, there were many cases in which their place in relation to the Fourth Amendment had to be decided, clarified, and sometimes re-thought. One early such case was Olmstead v. United States in 1928, in which the Supreme Court decided that telephone communications were not protected by the Fourth Amendment and that law enforcement could use a wire tap to listen to a person's telephone communications without a warrant. The Court's reasoning was that tapping a phone line required no entry into a private space, and that sounds which can be heard from outside a private place (such as raised voices) do not require a warrant to gather. This decision was reversed a few decades later in Katz v. United States in 1967, which ruled that telephone wire taps did require a warrant if the conversation took place in a location that would otherwise be assumed to be private. This decision has been the basis for many other court decisions and laws around electronic communications over time. Shortly after the Katz decision, Congress passed the Omnibus Crime Control and Safe Streets Act, which established rules for when and how a warrant could be obtained for a telephone wire tap. The Electronic Communications Privacy Act of 1986 extended these rules to include electronic communications such as email.
Another challenge that technology poses to the Fourth Amendment is that surveillance can be conducted from a long distance, can reveal a great deal of information, and can occur without the target (or anyone else) ever being aware that it took place. For instance, the Supreme Court ruled in 2001's Kyllo v. United States that law enforcement needed to obtain a warrant to perform surveillance with a thermal imaging device, which shows heat patterns inside a building. In fact, the decision restricted the use of any surveillance device that is not in common use and publicly available, because a person could still have a reasonable expectation of privacy against such devices. This decision therefore extends to many devices which are not publicly available, but it may mean that such devices become available to law enforcement without a warrant as they come into common use.
While in the U.S. the law primarily focuses on protecting citizens from government intrusion, some other legal systems include even broader definitions of privacy and control over information about oneself. Very few legal protections exist in the U.S. with regard to what personal data can be gathered by private companies or organizations. The main exception to this is the Health Insurance Portability and Accountability Act of 1996, which gives strict rules for how health care information can be gathered, stored, and shared between health care providers and keeps most information from being released by health care providers without explicit permission from the patient. However, some other governments have much broader legal privacy rights for individuals and aim to provide individuals more control over information about themselves. Ireland's Data Protection Act of 1988 requires that companies provide copies of all data gathered on a user within 44 days of a request. Furthermore, the European Union's highest court issued a decision in 2014 that supported a "right to be forgotten," meaning that E.U. citizens have a right to have search engine results about themselves removed. Private companies such as Facebook argue that these sorts of rights over information harm their business model and ability to innovate. For instance, Facebook has argued that if it were to release all information gathered about its users, it would reveal trade secrets about its algorithms and systems, which could give competitors a chance to copy those methods and harm its profitability. This balance between the individual's and the service provider's ownership of data is something that will likely need to be addressed by governments worldwide as the scope of data gathered about individuals, and the ways in which it can be used, continue to expand.
(Original text from EFF and EFF Internet Law Treatise under CC-BY-NC-SA license, edited and expanded by Sofia Lemons.)
Privacy issues on the Internet often arise from the growing practice of data collection. Internet communication and commerce increasingly use specific and detailed data about Internet consumers and their online behavior, driving the development of ever more sophisticated technology to track the activities of users. Not only have the federal and state governments taken an interest in regulating data collection; there is also a significant focus on industry self-regulation. In addition, private parties continue to file lawsuits related to data collection practices. In particular, there is increased sensitivity and regulation related to financial data, medical data, and data collected from children.
In addition to direct requests for information, most online services also use some form of tracking technology to collect information while a user “surfs” a particular site or the Internet generally. This data can be used to create a record of a user’s system information, online communications, transactions and other activities, including websites visited, pages and ads viewed, purchases made and more. This record of a user’s session on the Internet is sometimes called transactional data. Examples of such data include:
IP Address. When a user connects to the Internet through an ISP, the user’s ISP assigns the computer a numeric Internet Protocol address. The IP address allows the user’s computer to communicate with the servers of the Web sites he or she visits, and may be traced to the ISP or, in some cases, computer owner. Generally, IP addresses are automatically gathered and maintained by the Web site.
Cookies. “Cookies” are small data text files that are sent from a server computer to a recipient computer during a browsing session. Cookies allow a Web site server to “remember” what the user did when he or she visited the site; for example, when the last visit occurred and which pages were viewed at that time. While a cookie identifies an individual user’s computer in the sense that it can distinguish one computer from another, it typically does not reveal the actual identity of the user. Generally, cookies do not pose a threat in terms of destroying or compromising a system.
Referrers. Some online services may collect a “referrer” from the user’s Web browser which references the last URL that the user has visited. Such information can be used to identify and track a user’s movement across the Web.
GUIDs. A Globally Unique Identifier is an alphanumeric identifier for a unique file or installation of software, or a particular user. For example, a website may assign a GUID to a user’s browser to track the user’s session, or the Windows operating system may use a GUID to identify software or other files created or downloaded on the user’s hard drive.
Web Bugs. Web bugs are small, transparent image files, typically 1-by-1 pixel in size, placed on websites or in email messages. Site operators and email marketers may use Web bugs to determine whether an email message has been read; to identify the IP addresses of users’ computers; to know the time when a Web page was viewed or an email message read; and to retrieve cookie information. Web bugs are often used in conjunction with other data collection techniques.
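To make the cookie and GUID mechanisms above concrete, the following sketch shows how a site might assign a GUID-style tracking identifier in a cookie on a first visit and recognize it on later visits. It uses only the Python standard library; the `visitor_id` cookie name and the `handle_request` helper are hypothetical, not any particular site's implementation.

```python
from http.cookies import SimpleCookie
import uuid

def handle_request(cookie_header=None):
    """Return (visitor_id, set_cookie_header). The header is empty for
    returning visitors, since their browser already holds the cookie."""
    cookies = SimpleCookie(cookie_header or "")
    if "visitor_id" in cookies:
        # Returning visitor: the cookie distinguishes this browser.
        return cookies["visitor_id"].value, ""
    # First visit: mint a GUID and ask the browser to store it.
    visitor_id = str(uuid.uuid4())
    out = SimpleCookie()
    out["visitor_id"] = visitor_id
    return visitor_id, out.output()   # e.g. "Set-Cookie: visitor_id=..."
```

Note that the GUID alone identifies only a browser; it becomes personal information once the site links it to registration data, as discussed below.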
By themselves, cookies and other tracking technology typically do not reveal the actual identity of an individual. When matched with personal information provided by the user (such as registration data), however, the data can be used to create a profile of a specific user. Such information as identifies a user or is linked to information that identifies a user is generally deemed personal information. Cookies and other tracking technology enhance the browsing experience by identifying the user with his or her previously selected preferences or activities during earlier visits, which “personalizes” the site for the user’s repeated visits. Important to website operators and their advertisers, such technology also allows Web sites to develop profiles of page views (“hits”) to the site or hit logs, as well as the preferences of individual users.
Typically, a website asks a user to agree to its terms of service before creating an account, logging in, or accessing certain features. However, these terms of service are often vague, long, and written in complicated language, making it difficult for users to understand what they are really agreeing to. Some privacy advocates argue that there should be reform in how terms of service are presented to users, and possibly stricter limitations on what data companies are permitted to gather regardless of what terms a user agrees to. One major distinction in privacy practices is that some programs and sites offer privacy options as opt-in, while others offer privacy only for those who opt out. An opt-in policy is one in which the site or program will not track your data unless you select the option to allow it to do so. This presents the user with the active choice of whether or not to give permission for specific data to be tracked. An opt-out policy is one in which the site or program simply starts gathering data, but provides an option somewhere that users can seek out and select to stop that data from being tracked. Under an opt-out policy, users may not be fully aware of the specific kinds of data being tracked about them and must go to extra effort to prevent that tracking.
Data brokers are companies that trade in information on people -- names, addresses, phone numbers, details of shopping habits, and personal data such as whether someone owns cats or is divorced. This information comes from easily accessible public data (such as data from the phone book) as well as from less accessible sources (such as when the DMV sells information like your name, address and the type of car you own). As Natasha Singer of the New York Times described in her portrait of data broker Acxiom in 2012, “If you are an American adult, the odds are that it knows things like your age, race, sex, weight, height, marital status, education level, politics, buying habits, household health worries, vacation dreams — and on and on.”
Data brokers make money by selling access to this information. Some companies deal specifically with regulated business purposes, such as helping employers run background checks on job applicants. Other data brokers sell or rent the data for marketing purposes.
But details about where these companies get all of their data are still fuzzy. Representative Edward Markey (D-Mass), Representative Joe Barton (R-TX) and six other lawmakers sent open letters to data brokers last year demanding answers about their business practices. The letters asked the companies to "provide a list of each entity (including private sources and government agencies and offices) that has provided data from or about consumers to you."
The companies gave vague responses. For example, in its 30-page response, Acxiom stated:
This question calls for Acxiom to provide information that would reveal business practices that are of a highly competitive nature. Acxiom cannot provide a list of each entity that has provided data from, or about, consumers to us.
While users can "opt out" of being tracked by individual data brokers, there is no way to be certain that they have opted out of every single current data broker or to guarantee that they will be opted out of tracking by any new data broker companies that might form.
(Original text from EFF and The Internet Society under CC-BY-NC-SA license, edited and expanded by Sofia Lemons.)
Encryption is the mathematical science of codes, ciphers, and secret messages. Throughout history, people have used encryption to send messages to each other that (hopefully) couldn't be read by anyone besides the intended recipient.
Today, we have computers that are capable of performing encryption for us. Digital encryption technology has expanded beyond simple secret messages; today, encryption can be used for more elaborate purposes, for example to verify the author of messages or to browse the Web anonymously with Tor.
Encryption technologies facilitate anonymous communication, a potential lifeline for citizens and activists under oppressive regimes and individuals in vulnerable communities, such as victims of domestic abuse, those in witness protection programs, and undercover police officers. The same technology, however, also can help bad actors hide activities and communications by using anonymity tools for cyber-bullying and other forms of online abuse.
Symmetric encryption uses an identical key to encrypt and decrypt the message. Both the sender and the receiver have access to the same key. While fast and efficient for computers, symmetric encryption must ensure that the key is reliably delivered to the recipient and does not fall into the wrong hands.
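The shared-key idea can be illustrated with a toy sketch. Here a repeating-key XOR stands in for a real symmetric cipher such as AES (this is a teaching device only, not a secure cipher), and the key and message are invented:

```python
import os

def xor_crypt(data: bytes, key: bytes) -> bytes:
    # XOR each byte of the data with the key, cycling the key as needed.
    # XOR is its own inverse, so one function both encrypts and decrypts.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = os.urandom(16)                       # the secret both parties share
ciphertext = xor_crypt(b"meet at noon", key)
plaintext = xor_crypt(ciphertext, key)     # same key recovers the message
```

The sketch also shows the delivery problem the paragraph describes: both ends need the identical `key`, so it must somehow reach the recipient without being intercepted.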
Asymmetric encryption, also known as public-key encryption, is a one-way form of encryption. Keys come in pairs: information encrypted with the public key can only be decrypted with the corresponding private key, and information encrypted with the private key can only be decrypted with the public key. A person publishes their public key for others to use when communicating with them. They can then use their private key to decrypt messages that others send to them. It is similar to a locked mailbox in which mail can be pushed through a slot for delivery, but retrieved only by the owner with a key. Public-key encryption is more secure than symmetric encryption because the private key is never sent out to others and therefore can't be intercepted by an interloper.
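The mailbox analogy can be made concrete with textbook RSA using deliberately tiny primes. This is a sketch of the key relationship only; real keys are hundreds of digits long and real systems add padding, so these numbers are purely illustrative:

```python
# Textbook-RSA toy: tiny primes chosen only so the arithmetic is visible.
p, q = 61, 53
n = p * q                       # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                          # public exponent (published with n)
d = pow(e, -1, phi)             # private exponent (modular inverse, py3.8+)

message = 42                    # must be smaller than n in this toy scheme
ciphertext = pow(message, e, n)     # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)   # only the private key recovers it
```

Publishing `(e, n)` while keeping `d` secret is the "slot in the mailbox": anyone can drop a message in, but only the key holder can take it out.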
Public-key encryption also allows users to provide a "digital signature" to verify their identity to the recipient of their messages. Because no one else should have the sender's private key, they can "sign" a message by encrypting it with their private key. The recipient can use their copy of the sender's public key to decrypt the message. If the message decrypts correctly, they know that the sender was, in fact, the person they thought them to be. However, signing alone provides no confidentiality: anyone with the sender's public key can decrypt and read the signed message.
Because a message can be encrypted again with a second key, the preferred method for having a private exchange in which identities are verified is to both sign and encrypt the messages being sent. The sender first encrypts the message with their own private key as a "signature," as described above. They then encrypt the resulting message with the recipient's public key, ensuring that only the recipient can read it. The recipient first decrypts the message with their own private key, then decrypts it with the sender's public key to verify the sender's identity. If both parties follow this strategy throughout the exchange, they can be certain both that their conversation is secure and that the person with whom they are communicating is really the person they intended.
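The sign-then-encrypt sequence can be sketched with the same textbook-RSA toy, this time with two keypairs. The primes, exponents, and message are invented, and real systems sign a hash of the message with padding rather than the raw value:

```python
# "Sign, then encrypt" with toy textbook RSA (illustration only).
def keypair(p, q, e):
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)            # private exponent (py3.8+)
    return (e, n), (d, n)          # (public key, private key)

def use_key(m, key):
    exp, n = key
    return pow(m, exp, n)          # modular exponentiation does the work

alice_pub, alice_priv = keypair(61, 53, 17)
bob_pub, bob_priv = keypair(89, 97, 5)   # larger modulus, so signed values fit

m = 42
signed = use_key(m, alice_priv)    # Alice "signs" with her private key...
sent = use_key(signed, bob_pub)    # ...then encrypts for Bob's eyes only

received = use_key(sent, bob_priv)       # Bob decrypts with his private key...
original = use_key(received, alice_pub)  # ...then verifies Alice's signature
```

If `original` matches a sensible message, Bob knows both that no one else read it in transit and that it came from the holder of Alice's private key.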
An encryption backdoor means that encryption software is written so an authorized third party (such as a government surveillance group) can gain access to and decrypt encrypted data without access to the keys. Backdoors allow for provisions such as wire tapping of digital communications even when they are encrypted. But such backdoors would also allow covert access to content. The technical consensus is that introducing backdoors by any of the currently proposed techniques puts legitimate users at risk and is unlikely to prevent criminals from communicating clandestinely. Bad actors will likely find alternative means of communicating, while average users may not have the same tools. This could both leave criminal communications immune from observation and leave user communications vulnerable to observation and interception by governments or bad actors who discover how to exploit the backdoors.
(Original text from EFF and Tor, edited and expanded by Sofia Lemons.)
Many people don't want the things they say online to be connected with their offline identities. They may be concerned about political or economic retribution, harassment, or even threats to their lives. Whistleblowers report news that companies and governments would prefer to suppress; human rights workers struggle against repressive governments; parents try to create a safe way for children to explore; victims of domestic violence attempt to rebuild their lives where abusers cannot follow.
Instead of using their true names to communicate, these people choose to speak using pseudonyms (assumed names) or anonymously (no name at all). For these individuals and the organizations that support them, secure anonymity is critical. It may literally save lives.
Anonymous communications have an important place in our political and social discourse. The Supreme Court has ruled repeatedly that the right to anonymous free speech is protected by the First Amendment. A frequently cited 1995 Supreme Court ruling in McIntyre v. Ohio Elections Commission reads:
Anonymity is a shield from the tyranny of the majority. . . . It thus exemplifies the purpose behind the Bill of Rights and of the First Amendment in particular: to protect unpopular individuals from retaliation . . . at the hand of an intolerant society.
The tradition of anonymous speech is older than the United States. Founders Alexander Hamilton, James Madison, and John Jay wrote the Federalist Papers under the pseudonym "Publius," and "the Federal Farmer" spoke up in rebuttal. Charlotte Brontë wrote Jane Eyre under a male pen name to avoid discrimination against women authors.
Many argue that these long-standing rights to anonymity and the protections it affords are critically important for the Internet. The Supreme Court has recognized the Internet offers a new and powerful democratic forum in which anyone can become a "pamphleteer" or "a town crier with a voice that resonates farther than it could from any soapbox." Others argue, however, that anonymity frees people from the consequences of their words and actions, and that it removes elements such as civility and empathy from the online discourse.
Tor (The Onion Router) networks and the Tor browser are tools that allow Internet users to remain anonymous online, hiding their IP addresses from those who might be trying to track them. Tor does this by distributing a user's transactions over several places on the Internet, so no single point can link the user to their destination. The idea is similar to using a twisty, hard-to-follow route in order to throw off somebody who is tailing you — and then periodically erasing your footprints. Instead of taking a direct route from source to destination, data packets on the Tor network take a random pathway through several relays that cover the user's tracks so no observer at any single point can tell where the data came from or where it's going.
To create a private network pathway with Tor, the user's software or client builds a circuit of encrypted connections through relays (nodes) on the network. The circuit is extended one hop at a time, and each relay along the way knows only which relay gave it data and which relay it is giving data to. No individual relay ever knows the complete path that a data packet has taken. The client negotiates a separate set of encryption keys for each hop along the circuit to ensure that each hop can't trace these connections as they pass through.
For efficiency, the Tor software uses the same circuit for connections that happen within the same ten minutes or so. Later requests are given a new circuit, to keep people from linking your earlier actions to the new ones.
The path and data in a Tor circuit are kept secret using public-key encryption. Each node provides a copy of its public key to the sender's Tor program. Data is encrypted with each of the nodes' public keys, one after the other, like layers of an onion (hence the "onion" in Tor). This is similar to a package locked in a box, then enclosed in another locked box, and so on. When the message arrives at a node, that node's private key is used to decrypt the part of the message that gives the IP address of the next node in the circuit. The node passes the data along to the next node, which repeats the same actions with the data until it reaches its destination. Only the final recipient is given the original data, and each node only learns the IP address of the node before and after it in the circuit.
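The layering described above can be sketched in a few lines. This is a simulation, not real Tor: XOR with a per-relay key stands in for public-key encryption, the relay names and message are invented, and a real relay would of course hold only its own key rather than share a dictionary with the client.

```python
import os, pickle

def xor(data: bytes, key: bytes) -> bytes:
    # Stand-in for encryption/decryption; XOR is its own inverse.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

relays = ["entry", "middle", "exit"]
keys = {r: os.urandom(32) for r in relays}   # one key per relay

# The client builds the onion inside-out. Each layer, once decrypted,
# reveals only (next_hop, inner_blob); the innermost layer for the
# exit relay holds (None, payload), meaning "deliver this."
onion = xor(pickle.dumps((None, b"hello, destination")), keys["exit"])
for inner, outer in [("exit", "middle"), ("middle", "entry")]:
    onion = xor(pickle.dumps((inner, onion)), keys[outer])

# Each relay peels exactly one layer, learning only the next hop.
hop = "entry"
while hop is not None:
    hop, onion = pickle.loads(xor(onion, keys[hop]))
# When hop is None, `onion` holds the original payload.
```

Notice that the entry relay's layer names only "middle", and the middle relay's layer names only "exit": no single relay ever sees both the source and the final payload, which is the property the paragraph describes.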