How do I select a data storage solution for my business?

We live in the age of data. We are constantly producing it, analyzing it, figuring out how to store and protect it, and, hopefully, using it to refine business practices. Unfortunately, 58% of organizations make decisions based on outdated data.

While enterprises are rapidly deploying technologies for real-time analytics, machine learning and IoT, they are still utilizing legacy storage solutions that are not designed for such data-intensive workloads.

To select a suitable data storage solution for your business, you need to weigh a variety of factors. We’ve talked to several industry leaders to get their insight on the topic.

Phil Bullinger, SVP and General Manager, Data Center Business Unit, Western Digital

Selecting the right data storage solution for your enterprise requires evaluating and balancing many factors. The most important is aligning the performance and capabilities of the storage system with your critical workloads and their specific bandwidth, application latency and data availability requirements. For example, if your business wants to gain greater insight and value from data through AI, your storage system should be designed to support the accelerated performance and scale requirements of analytics workloads.

Storage systems that maximize the performance potential of solid state drives (SSDs) and the efficiency and scalability of hard disk drives (HDDs) provide the flexibility and configurability to meet a wide range of application workloads.

Your applications should also drive the essential architecture of your storage system, whether directly connected or networked, whether required to store and deliver data as blocks, files, objects or all three, and whether the storage system must efficiently support a wide range of workloads while prioritizing the performance of the most demanding applications.

Consideration should also be given to your overall IT data management architecture: it must deliver the scalability, data protection and business continuity assurance your enterprise requires, spanning core data centers, locations distributed at or near the edge and endpoints of your operations, and integration with your cloud-resident applications, compute and data storage services.

Ben Gitenstein, VP of Product Management, Qumulo

When searching for the right data storage solution to support your organizational needs today and in the future, it’s important to select a solution that is trusted, scalable to secure demanding workloads of any size, and ensures optimal performance of applications and workloads both on premises and in complex, multi-cloud environments.

With the recent pandemic, organizations are digitally transforming faster than ever before and leveraging the cloud to conduct business. This makes it more important than ever that your storage solution has built-in tools for data management across this ecosystem.

When evaluating storage options, be sure to do your homework and ask the right questions. Is it a trusted provider? Would it integrate well within my existing technology infrastructure? Your storage solution should be easy to manage and meet the scale, performance and cloud requirements for any data environment and across multi-cloud environments.

Also, be sure the storage solution gives IT control over how storage capacity is managed and delivers real-time insight into analytics and usage patterns, so teams can make smart storage allocation decisions and maximize the organization’s storage budget.

David Huskisson, Senior Solutions Manager, Pure Storage

Data backup and disaster recovery features are critically important when selecting a storage solution for your business, as no organization is immune to ransomware attacks. When systems go down, they need to be recovered as quickly and safely as possible.

Look for solutions that offer simplicity in management, can ensure backups are viable even when admin credentials are compromised, and can be restored quickly enough to greatly reduce major organizational or financial impact.

Storage solutions that are purpose-built to handle unstructured data are a strong place to start. By definition, unstructured data is unpredictable: it can take any form, size or shape and can be accessed in any pattern. A purpose-built platform can accelerate small, large, random or sequential workloads and consolidate a wide range of them on a unified fast file and object storage platform, maintaining performance even as the amount of data grows.

If you have an existing backup product, you don’t need to rip and replace it. There are storage platforms with robust integrations that work seamlessly with existing solutions and offer a wide range of data-protection architectures so you can ensure business continuity amid changes.

Tunio Zafer, CEO, pCloud

Bear in mind: your security team needs to assist. Answer these questions to find the right solution. Do you need ‘cold’ storage or cloud storage? If you’re looking only to store files for backup, you need a cloud backup service; if you’re looking to store, edit and share, go for cloud storage. Where are their storage servers located? If your business is located in Europe, the safest choice is a storage service based in Europe.

Best-case scenario: your company is going to grow, so look for a storage service that offers scalability. What is their data privacy policy? Research whether someone can legally access your data without your knowledge or consent. Switzerland has one of the strictest data privacy laws globally, so choosing a Swiss-based service is a safe bet. How is your data secured? Look for a service that offers robust encryption in transit and at rest.

Client-side encryption means that your data is secured on your device and transferred already encrypted. What is their support package? At some point you’re going to need help, so prefer a data storage service whose support is included for free and responds within 24 hours.
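
To make the client-side pattern concrete, here is a minimal Python sketch using the third-party cryptography package; the file contents and key handling are placeholders, and a real service would add key backup, key derivation from the user’s passphrase, and metadata protection:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def encrypt_for_upload(plaintext: bytes, key: bytes) -> bytes:
        """AES-256-GCM on the device: returns nonce || ciphertext+tag."""
        nonce = os.urandom(12)  # fresh nonce per message; never reuse with a key
        return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

    def decrypt_after_download(blob: bytes, key: bytes) -> bytes:
        nonce, ciphertext = blob[:12], blob[12:]
        return AESGCM(key).decrypt(nonce, ciphertext, None)

    key = AESGCM.generate_key(bit_length=256)  # stays on the user's device
    blob = encrypt_for_upload(b"contents of report.xlsx", key)
    assert decrypt_after_download(blob, key) == b"contents of report.xlsx"

The provider only ever stores the encrypted blob, so a server-side breach exposes ciphertext rather than files.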

Cybersecurity practices are becoming more formal, security teams are expanding

Organizations are building confidence that their cybersecurity practices are headed in the right direction, aided by advanced technologies, more detailed processes, comprehensive education and specialized skills, research from CompTIA finds.

Eight in 10 organizations surveyed said their cybersecurity practices are improving.

At the same time, many companies acknowledge that there is still more to do to make their security posture even more robust. Growing concerns about the number, scale and variety of cyberattacks, privacy considerations, a greater reliance on data and regulatory compliance are among the issues that have the attention of business and IT leaders.

Elevating cybersecurity

Two factors – one anticipated, the other unexpected – have contributed to the heightened awareness about the need for strong cybersecurity measures.

“The COVID-19 pandemic has been the primary trigger for revisiting security,” said Seth Robinson, senior director for technology analysis at CompTIA. “The massive shift to remote work exposed vulnerabilities in workforce knowledge and connectivity, while phishing emails preyed on new health concerns.”

Robinson noted that the pandemic accelerated changes that were underway in many organizations that were undergoing the digital transformation of their business operations.

“This transformation elevated cybersecurity from an element within IT operations to an overarching business concern that demands executive-level attention,” he said. “It has become a critical business function, on par with a company’s financial procedures.”

As a result, companies have a better understanding of what to do about cybersecurity. Nine in 10 organizations said their cybersecurity processes have become more formal and more critical.

Two examples are risk management, where companies assess their data and their systems to determine the level of security that each requires; and monitoring and measurement, where security efforts are continually tracked and new metrics are established to tie security activity to business objectives.

IT teams’ foundational skills

The report also highlights how the “cybersecurity chain” has expanded to include upper management, boards of directors, business units and outside firms in addition to IT personnel in conversations and decisions.

Within IT teams, foundational skills such as network and endpoint security have been paired with new skills, including identity management and application security, that have become more important as cloud and mobility have taken hold.

On the horizon, expect to see skills related to security monitoring and other proactive tactics gain a bigger foothold. Examples include data analysis, threat knowledge and understanding the regulatory landscape.

Cybersecurity insurance is another emerging area. The report reveals that 45% of large companies, 41% of mid-sized firms and 37% of small businesses currently have a cyber insurance policy.

Common coverage areas include the cost of restoring data (56% of policy holders), the cost of finding the root cause of a breach (47%), coverage for third-party incidents (43%) and response to ransomware (42%).

NIST crowdsourcing challenge aims to de-identify public data sets to protect individual privacy

NIST has launched a crowdsourcing challenge to spur new methods to ensure that important public safety data sets can be de-identified to protect individual privacy.

The Differential Privacy Temporal Map Challenge includes a series of contests that will award a total of up to $276,000 for differential privacy solutions for complex data sets that include information on both time and location.

Critical applications vulnerability

For critical applications such as emergency planning and epidemiology, public safety responders may need access to sensitive data, but sharing that data with external analysts can compromise individual privacy.

Even if data is anonymized, malicious parties may be able to link the anonymized records with third-party data and re-identify individuals. And, when data has both geographical and time information, the risk of re-identification increases significantly.

“Temporal map data, with its ability to track a person’s location over a period of time, is particularly helpful to public safety agencies when preparing for disaster response, firefighting and law enforcement tactics,” said Gary Howarth, NIST prize challenge manager.

“The goal of this challenge is to develop solutions that can protect the privacy of individual citizens and first responders when agencies need to share data.”

Protecting PII

Differential privacy provides much stronger data protection than anonymity; it’s a provable mathematical guarantee that protects personally identifiable information (PII).

By fully de-identifying data sets containing PII, researchers can ensure data remains useful while limiting what can be learned about any individual in the data regardless of what third-party information is available.
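
The core mechanism is easy to sketch. The toy Python example below answers a count query with Laplace noise calibrated to the query’s sensitivity; the dataset and epsilon are invented for illustration, and real deployments use hardened, audited implementations:

    import random

    def dp_count(records, predicate, epsilon: float) -> float:
        """Release a count with Laplace(sensitivity/epsilon) noise."""
        true_count = sum(1 for r in records if predicate(r))
        sensitivity = 1.0  # one person changes a count by at most 1
        # The difference of two exponentials is Laplace-distributed.
        rate = epsilon / sensitivity
        noise = random.expovariate(rate) - random.expovariate(rate)
        return true_count + noise

    ages = [34, 29, 41, 57, 62, 45]
    print(dp_count(ages, lambda a: a > 40, epsilon=0.5))  # noisy answer near 4

What makes the challenge hard is that map data adds temporal and spatial structure, so the noise must preserve far more than simple counts.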

The individual contests that make up the challenge will include a series of three “sprints” in which participants develop privacy algorithms and compete for prizes, as well as a scoring metrics development contest (A Better Meter Stick for Differential Privacy Contest) and a contest designed to improve the usability of the solvers’ source code (The Open Source and Development Contest).

The Better Meter Stick for Differential Privacy Contest will award a total prize purse of $29,000 for winning submissions that propose novel scoring metrics by which to assess the quality of differentially private algorithms on temporal map data.

The three Temporal Map Algorithms sprints will award a total prize purse of $147,000 over a series of three sprints to develop algorithms that preserve data utility of temporal and spatial map data sets while guaranteeing privacy.

The Open Source and Development Contest will award a total prize purse of $100,000 to teams leading in the sprints to increase their algorithm’s utility and usability for open source audiences.

70% of consumers would cut ties with doctors over unprotected health data

Privacy concerns among Americans are growing due to COVID-19, with nearly 70 percent saying they would likely sever ties with their healthcare provider if they found that their personal health data was unprotected, a CynergisTek survey reveals.

Privacy concerns

And as many employers seek to welcome staff back into physical workplaces, nearly half (45 percent) of Americans expressed concerns about keeping personal health information private from their employer.

“With the enactment of key regulations including CCPA and GDPR, we are seeing the convergence of security and privacy come to the forefront at national, state and corporate levels.

“As healthcare systems and corporations continue to grapple with data challenges associated with COVID-19 – whether that’s more sophisticated, targeted cyber-attacks or the new requirements around interoperability and data sharing, concerns around personal data and consumer awareness of privacy rights will only continue to grow,” said Caleb Barlow, president and CEO of CynergisTek.

Patients contemplate cutting ties over unprotected health data

While many still assume personal data is under lock and key, 18 percent of Americans are beginning to question whether personal health data is being adequately protected by healthcare providers. In fact, 47.5 percent stated they were unlikely to use telehealth services again should a breach occur, sounding the alarm for a burgeoning telehealth industry predicted to be worth over $260B by 2026.

While 3 out of 4 Americans still largely trust their data is properly protected by their healthcare provider, tolerance is beginning to wane with 67 percent stating they would change providers if it was found that their data was not properly protected. When drilling deeper into certain age groups and health conditions, the survey also found that:

  • Gen X (73 percent) and Millennials (70 percent) proved even less tolerant compared to other demographics when parting ways with their providers due to unprotected health data.
  • 66 percent of Americans living with chronic health conditions stated they would be willing to change up care providers should their data be compromised.

Data shows that health systems that have not invested the time, money and resources to keep pace with the ever-changing threat landscape are falling behind. Of the nearly 300 healthcare facilities assessed, fewer than half met NIST Cybersecurity Framework guidelines.

Concern about sharing COVID-19 health data upon returning to work

As pressures mount for returning employees to disclose COVID-19 health status and personal interactions, an increasing conflict between ensuring public health safety and upholding employee privacy is emerging.

This is increasingly evident, with 45 percent stating a preference to keep personal health information private from their employer and over 1 in 3 expressing concerns about sharing COVID-19-specific health data such as temperature checks. Office re-openings may prove more complicated than anticipated.

“The challenges faced by both healthcare providers and employers during this pandemic have seemed insurmountable at times, but the battle surrounding personal health data and privacy is a challenge we must rise to,” said Russell P. Branzell, president and CEO of the College of Healthcare Information Management Executives.

“With safety and security top of mind for all, it is imperative that these organizations continue to take the necessary steps to fully protect this sensitive data from end to end, mitigating any looming cyberthreats while creating peace of mind for the individual.”

Beyond unwanted employer access to personal data, the survey found that nearly 60 percent of respondents expressed anxieties around their employer sharing personal health data externally to third parties such as insurance companies and employee benefit providers without consent.

This stands in stark contrast to Accenture’s recent survey, which found that 62 percent of C-suite executives were exploring new tools to collect employee data, and it is a reminder to employers to tread lightly when mandating employee health protocols and questionnaires.

“COVID-19 has thrown many curveballs at both healthcare providers and employers, and the privacy and protection of critical patient and employee data must not be ignored,” said David Finn, executive VP of strategic innovation of CynergisTek.

“By getting ahead of the curve and implementing system-wide risk posture assessments and ensuring employee opt-in/opt-out functions when it comes to sharing personal data, these organizations can help limit these privacy and security risks.”

MatRiCT: A quantum-safe and privacy-preserving blockchain protocol

Researchers from CSIRO’s Data61 and the Monash Blockchain Technology Centre have developed the world’s most efficient blockchain protocol that is both secure against quantum computers and protects the privacy of its users and their transactions.

The technology can be applied beyond cryptocurrencies, in areas such as digital health, banking, finance and government services, as well as services that require accountability to prevent illegal use.

The protocol — a set of rules governing how a blockchain network operates — is called MatRiCT.

Cryptocurrencies vulnerable to attacks by quantum computers

The cryptocurrency market is currently valued at more than $325 billion, with an average of approximately $50 billion traded daily over the past year.

However, blockchain-based cryptocurrencies like Bitcoin and Ethereum are vulnerable to attacks by quantum computers, which are capable of performing complex calculations and processing substantial amounts of data to break blockchains, in significantly faster times than current computers.

“Quantum computing can compromise the signatures or keys used to authenticate transactions, as well as the integrity of blockchains themselves,” said Dr Muhammed Esgin, lead researcher at Monash University and Data61’s Distributed Systems Security Group. “Once this occurs, the underlying cryptocurrency could be altered, leading to theft, double spend or forgery, and users’ privacy may be jeopardised.

“Existing cryptocurrencies tend to either be quantum-safe or privacy-preserving, but for the first time our new protocol achieves both in a practical and deployable way.”

The MatRiCT protocol is based on hard lattice problems, which are quantum secure, and introduces three new key features: the shortest quantum-secure ring signature scheme to date, which authenticates activity and transactions using only the signature; a zero-knowledge proof method, which hides sensitive transaction information; and an auditability function, which could help prevent illegal cryptocurrency use.

Blockchain challenged by speed and energy consumption

Speed and energy consumption are significant challenges presented by blockchain technologies which can lead to inefficiencies and increased costs.

“The protocol is designed to address the inefficiencies in previous blockchain protocols such as complex authentication procedures, thereby speeding up calculation efficiencies and using less energy to resolve, leading to significant cost savings,” said Dr Ron Steinfeld, associate professor, co-author of the research and a quantum-safe cryptography expert at Monash University.

“Our new protocol is significantly faster and more efficient, as the identity signatures and proof required when conducting transactions are the shortest to date, thereby requiring less data communication, speeding up the transaction processing time, and reducing the amount of energy required to complete transactions.”

“Hcash will be incorporating the protocol into its own systems, transforming its existing cryptocurrency, HyperCash, into one that is both quantum safe and privacy protecting,” said Dr Joseph Liu, associate professor, Director of Monash Blockchain Technology Centre and HCash Chief Scientist.

85% of COVID-19 tracking apps leak data

71% of healthcare and medical apps have at least one serious vulnerability that could lead to a breach of medical data, according to Intertrust.

The report investigated 100 publicly available global mobile healthcare apps across a range of categories—including telehealth, medical device, health commerce, and COVID-tracking—to uncover the most critical mHealth app threats.

Cryptographic issues pose one of the most pervasive and serious threats, with 91% of the apps in the study failing one or more cryptographic tests. This means the encryption used in these medical apps can be easily broken by cybercriminals, potentially exposing confidential patient data, and enabling attackers to tamper with reported data, send illegitimate commands to connected medical devices, or otherwise use the application for malicious purposes.

Bringing medical apps security up to speed

The study’s overall findings suggest that the push to reshape care delivery under COVID-19 has often come at the expense of mobile application security.

“Unfortunately, there’s been a history of security vulnerabilities in the healthcare and medical space. Things are getting a lot better, but we still have a lot of work to do,” said Bill Horne, General Manager of the Secure Systems product group and CTO at Intertrust.

“The good news is that application protection strategies and technologies can help healthcare organizations bring the security of their apps up to speed.”

The report on healthcare and medical mobile apps is based on an audit of 100 iOS and Android applications from healthcare organizations worldwide. All 100 apps were analyzed using an array of static application security testing (SAST) and dynamic application security testing (DAST) techniques based on the OWASP mobile app security guidelines.

Report highlights

  • 71% of tested medical apps have at least one high level security vulnerability. A vulnerability is classified as high if it can be readily exploited and has the potential for significant damage or loss.
  • The vast majority of medical apps (91%) have mishandled and/or weak encryption that puts them at risk for data exposure and IP (intellectual property) theft.
  • 34% of Android apps and 28% of iOS apps are vulnerable to encryption key extraction.
  • The majority of mHealth apps contain multiple security issues with data storage. For instance, 60% of tested Android apps stored information in SharedPreferences, leaving unencrypted data readily readable and editable by attackers and malicious apps.
  • When looking specifically at COVID-tracking apps, 85% leak data.
  • 83% of the high-level threats discovered could have been mitigated using application protection technologies such as code obfuscation, tampering detection, and white-box cryptography.

On Executive Order 12333

Mark Jaycox has written a long article on US Executive Order 12333: “No Oversight, No Limits, No Worries: A Primer on Presidential Spying and Executive Order 12,333”:

Abstract: Executive Order 12,333 (“EO 12333”) is a 1980s Executive Order signed by President Ronald Reagan that, among other things, establishes an overarching policy framework for the Executive Branch’s spying powers. Although electronic surveillance programs authorized by EO 12333 generally target foreign intelligence from foreign targets, its permissive targeting standards allow for the substantial collection of Americans’ communications containing little to no foreign intelligence value. This fact alone necessitates closer inspection.

This working draft conducts such an inspection by collecting and coalescing the various declassifications, disclosures, legislative investigations, and news reports concerning EO 12333 electronic surveillance programs in order to provide a better understanding of how the Executive Branch implements the order and the surveillance programs it authorizes. The Article pays particular attention to EO 12333’s designation of the National Security Agency as primarily responsible for conducting signals intelligence, which includes the installation of malware, the analysis of internet traffic traversing the telecommunications backbone, the hacking of U.S.-based companies like Yahoo and Google, and the analysis of Americans’ communications, contact lists, text messages, geolocation data, and other information.

After exploring the electronic surveillance programs authorized by EO 12333, this Article proposes reforms to the existing policy framework, including narrowing the aperture of authorized surveillance, increasing privacy standards for the retention of data, and requiring greater transparency and accountability.

Large vendor ecosystems and low visibility increase third-party cyber risk

80% of organizations experienced a cybersecurity breach that originated from vulnerabilities in their vendor ecosystem in the past 12 months, and the average organization had been breached in this way 2.7 times, according to a BlueVoyant survey.

The research also found organizations are experiencing multiple pain points across their cyber risk management program as they aim to mitigate risk across a network that typically encompasses 1409 vendors.

The study was conducted by Opinion Matters and recorded the views and experiences of 1505 CIOs, CISOs and Chief Procurement Officers in organizations with more than 1000 employees across a range of vertical sectors including business and professional services, financial services, healthcare and pharmaceutical, manufacturing, utilities and energy. It covered five countries: USA, UK, Mexico, Switzerland and Singapore.

Third-party cyber risk budgets and other key findings

  • 29% say they have no way of knowing if cyber risk emerges in a third-party vendor
  • 22.5% monitor their entire supply chain
  • 32% only re-assess and report their vendor’s cyber risk position either six-monthly or less frequently
  • The average headcount in internal and external cyber risk management teams is 12
  • 81% say that budgets for third-party cyber risk management are increasing, by an average figure of 40%

Commenting on the research findings, Jim Penrose, COO BlueVoyant, said: “That four in five organizations have experienced recent cybersecurity breaches originating in their vendor ecosystem is of huge concern.

“The research clearly indicated the reasons behind this high breach frequency: only 23% are monitoring all suppliers, meaning 77% have limited visibility and almost one-third only re-assess their vendors’ cyber risk position six-monthly or annually. That means in the intervening period they are effectively flying blind to risks that could emerge at any moment in the prevailing cyber threat environment.”

Multiple pain points exist in third-party cyber risk programs as budgets rise

Further insight into the difficulties that are leading to breaches was revealed when respondents were asked to identify the top three pain points related to their third-party cyber risk programs, in the past 12 months.

The most common problems were:

  • Managing the volume of alerts generated by the program
  • Working with suppliers to improve security performance, and
  • Prioritizing which risks are urgent and which are not.

However, overall responses were almost equally spread across thirteen different areas of concern. In response to these issues, budgets for third-party cyber risk programs are set to rise in the coming year. 81% of survey respondents said they expect to see budgets increase, by 40% on average.

Jim Penrose continues: “The fact that cyber risk management professionals are reporting difficulties across the board shows the complexity they face in trying to improve performance.

“It is encouraging that budget is being committed to tackling the problem, but with so many issues to solve many organizations will find it hard to know where to start. Certainly, the current approach is not working, so simply trying to do more of the same will not shift the dial on third-party cyber risk.”

Variation across industry sectors

Analysis of the responses from different commercial sectors revealed considerable variations in their experiences of third-party cyber risk. The business services sector is suffering the highest rate of breaches, with 89% saying they have been breached via a weakness in a third-party in the past 12 months.

The average number of incidents experienced in the past 12 months was also highest in this sector, at 3.6. This is undoubtedly partly down to the fact that firms in the sector reported working with 2572 vendors, on average.

In contrast, only 57% of respondents from the manufacturing sector said they had suffered third-party cyber breaches in the past 12 months. The sector works with 1325 vendors on average, but had a much lower breach frequency, at 1.7.

Thirteen percent of respondents from the manufacturing sector also reported having no pain points in their third-party cyber risk management programs, a percentage more than twice as high as in any other sector.

Commenting on the stark differences observed between sectors, Jim Penrose said: “This underlines that there is no one-size-fits-all solution to managing third-party cyber risk.

“Different industries have different needs and are at varying stages of maturity in their cyber risk management programs. This must be factored into attempts to improve performance so that investment is directed where it has the greatest impact.”

Mix of tools and tactics in play

The survey investigated the tools organizations have in place to implement third-party cyber risk management and found a mix of approaches with no single approach dominating.

Many organizations are evolving towards a data-driven strategy, with supplier risk data and analytics in use by 40%. However static, point-in-time tactics such as on-site audits and supplier questionnaires remain common.

Jim Penrose concludes: “Overall the research findings indicate a situation where the large scale of vendor ecosystems and the fast-changing threat environment is defeating attempts to effectively manage third-party cyber risk in a meaningful way.

“Visibility into such a large and heterogenous group of vendors is obscured due to lack of resources and a continuing reliance on manual, point-in-time processes, meaning real-time emerging cyber risk is invisible for much of the time.

“For organizations to make meaningful progress in managing third-party cyber risk and reduce the current concerning rate of breaches, they need to be pursuing greater visibility across their vendor ecosystem and achieving better context around alerts so they can be prioritized, triaged and quickly remediated with suppliers.”

iOS 14: New privacy and security features

Apple has released iOS 14, with a bucketload of new and improved functional features and a handful of privacy and security ones.

New privacy and security features in iOS 14

The new iOS will tell you when an app is using your camera or microphone

It will show an indicator dot (green for when camera or camera+microphone is in use, orange for microphone) in the top right part of the device’s screen.

The downside is that it’s fairly small and you might miss it if other things are happening on the screen. The upside is that you can check which app most recently used your camera or microphone via the Control Center.

Of course, you can deny access to your camera and microphone to any app through the Privacy settings.

You can share with apps your approximate location instead of the precise one

Go to Settings > Privacy > Location Services, and you can configure for each app whether it may access your device’s location “only while the app is in use”, “always” or “never”, or whether the app must ask you for permission each time you run it (in which case you get the option to grant access “only once”).

When you allow location access for an app, you’ll get the option to provide your precise location or leave it to the app to determine your approximate location (the latter is good enough for apps that show local news or weather).

You can choose to share with apps just some photos

Under Privacy > Photos you can see which apps have requested access to your photos and you can choose to restrict each app’s access just to selected photos or photo albums (or none).

You can limit tracking

When you join a Wi-Fi network, your phone presents a randomized MAC address instead of its hardware one. This prevents ISPs and advertisers from tracking your movements (i.e., seeing when and where you connect to a network), and the option is on by default.
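
For illustration, this Python snippet generates the kind of random “locally administered” MAC address used for such randomization; it is not Apple’s actual algorithm:

    import random

    def random_private_mac() -> str:
        # Set the locally-administered bit (0x02), clear the multicast bit (0x01).
        first = (random.randrange(256) | 0b00000010) & 0b11111110
        rest = [random.randrange(256) for _ in range(5)]
        return ":".join(f"{b:02x}" for b in [first] + rest)

    print(random_private_mac())  # e.g. 5e:8f:02:9c:d1:7a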

In Settings > Privacy > Tracking, you can choose to not allow apps to send you a request to track you. If you do that, “any app that attempts to ask you for your permission will be blocked from asking and automatically informed that you have requested not to be tracked. In addition, all apps, other than those that you have previously given permission to track, will be blocked from accessing the device’s Advertising Identifier.”

If you allow tracking, tracking permissions can also be controlled on a per-app basis.

It has to be pointed out, though, that these app tracking options will start working as intended in early 2021, when these privacy controls become mandatory for developers.

“We want to give developers the time they need to make the necessary changes, and as a result, the requirement to use this tracking permission will go into effect early next year,” Apple explained.

Facebook complained earlier this year that these new privacy requirements would have a significant negative impact on its advertising business.

You will be able to see a summary of an app’s privacy practices before you download it from the App Store

You can’t see these summaries yet because app developers have yet to roll them out, but once they do, you’ll be able to peruse them via an “App Privacy” section on each app’s App Store listing.

You’ll be able to see which tracking cookies have been blocked

The Safari mobile browser has been updated with a Privacy Report, which lists all the cross-site tracking cookies it has blocked in the last 30 days, provided you have turned on Prevent Cross-Site Tracking in Safari’s Privacy and Security settings.

The report is accessible from the AA menu in the browser’s address bar.

You’ll be notified if a password you stored in the iCloud Keychain has been spotted in a known data breach

To turn this option on, go to Settings > Passwords > Security Recommendations and toggle on Detect Compromised Passwords. For the secure password monitoring to work, iCloud Keychain has to be enabled.

Fixed vulnerabilities

In iOS 14, Apple has also fixed a number of security vulnerabilities, including:

  • A vulnerability in the IDE Device Support component that could allow a remote authenticated attacker to execute arbitrary code on a paired device during a debug session over the network (CVE-2020-9992), and
  • A logic issue affecting the sandbox that may allow a malicious application to access restricted files (CVE-2020-9968)

Secure data sharing in a world concerned with privacy

The ongoing debate surrounding privacy protection in the global data economy reached a fever pitch with July’s “Schrems II” ruling at the European Court of Justice, which struck down the Privacy Shield – a legal mechanism enabling companies to transfer personal data from the EU to the US for processing – potentially disrupting the business of thousands of companies.

The plaintiff, Austrian privacy advocate Max Schrems, claimed that US privacy legislation was insufficiently robust to prevent national security and intelligence authorities from acquiring – and misusing – Europeans’ personal data. The EU’s top court agreed, abolishing the Privacy Shield and requiring American companies that exchange data with European partners to comply with the standards set out by the GDPR, the EU’s data privacy law.

Following this landmark ruling, ensuring the secure flow of data from one jurisdiction to another will be a significant challenge, given the lack of an international regulatory framework for data transfers and emerging conflicts between competing data privacy regulations.

This comes at a time when the COVID-19 crisis has further underscored the urgent need for collaborative international research involving the exchange of personal data – in this case, sensitive health data.

Will data protection regulations stand in the way of this and other vital data sharing?

The Privacy Shield was a stopgap measure to facilitate data-sharing between the US and the EU which ultimately did not withstand legal scrutiny. Robust, compliant-by-design tools beyond contractual frameworks will be required in order to protect individual privacy while allowing data-driven research on regulated data and business collaboration across jurisdictions.

Fortunately, innovative privacy-enhancing technologies (PETs) can be the stable bridge connecting differing – and sometimes conflicting – privacy frameworks. Here’s why policy alone will not suffice to resolve existing data privacy challenges – and how PETs can deliver the best of both worlds:

A new paradigm for ethical and secure data sharing

The abolition of the Privacy Shield poses major challenges for over 5,000 American and European companies which previously relied on its existence and must now confront a murky legal landscape. While big players like Google and Zoom have the resources to update their compliance protocols and negotiate legal contracts between transfer partners, smaller innovators lack these means and may see their activities slowed or even permanently halted. Privacy legislation has already impeded vital cross-border research collaborations – one prominent example is the joint American-Finnish study regarding the genomic causes of diabetes, which “slowed to a crawl” due to regulations, according to the head of the US National Institutes of Health (NIH).

One response to the Schrems II ruling might be expediting moves towards a federal data privacy law in the US. But this would take time: in Europe, over two years passed between the adoption of GDPR and its enforcement. Given that smaller companies are facing an immediate legal threat to their regular operations, a federal privacy law might not come quickly enough.

Even if such legislation were to be approved in Washington, it is unlikely to be fully compatible with GDPR – not to mention widening privacy regulations in other countries. The CCPA, the major statewide data protection initiative, is generally considered less stringent than GDPR, meaning that even CCPA-compliant businesses would still have to adapt to European standards.

In short, the existing legislative toolbox is insufficient to protect the operations of thousands of businesses in the US and around the world, which is why it’s time for a new paradigm for privacy-preserving data sharing based on Privacy-Enhancing Technologies.

The advantages of privacy-enhancing technologies

Compliance costs and legal risks are prompting companies to consider an innovative data sharing method based on PETs: a new genre of technologies which can help them bridge competing privacy frameworks. PETs are a category of technologies that protect data along its lifecycle while maintaining its utility, even for advanced AI and machine learning processes. PETs allow their users to harness the benefits of big data while protecting personally identifiable information (PII) and other sensitive information, thus maintaining stringent privacy standards.

One such PET playing a growing role in privacy-preserving information sharing is Homomorphic Encryption (HE), a technique regarded by many as the holy grail of data protection. HE enables multiple parties to securely collaborate on encrypted data by conducting analysis on data which remains encrypted throughout the process, never exposing personal or confidential information. Through HE, companies can derive the necessary insights from big data while protecting individuals’ personal details – and, crucially, while remaining compliant with privacy legislation because the data is never exposed.
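
To make the idea concrete, here is a toy additively homomorphic scheme (Paillier) in pure Python: the server multiplies two ciphertexts to add the underlying values without ever decrypting them. The primes are demo-sized and the code is a sketch of the principle, not the production-grade HE systems described here:

    import random
    from math import gcd, lcm

    p, q = 1_000_003, 1_000_033          # toy primes; real keys are 2048+ bits
    n, n2 = p * q, (p * q) ** 2
    lam = lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)                 # valid since L(g^lam mod n^2) = lam mod n

    def encrypt(m: int) -> int:
        r = random.randrange(1, n)
        while gcd(r, n) != 1:
            r = random.randrange(1, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def decrypt(c: int) -> int:
        L = (pow(c, lam, n2) - 1) // n   # L(x) = (x - 1) / n
        return (L * mu) % n

    c1, c2 = encrypt(42), encrypt(58)
    c_sum = (c1 * c2) % n2               # ciphertext product = plaintext sum
    assert decrypt(c_sum) == 100         # the server never saw 42 or 58

Production HE systems support far richer operations, but the privacy property is the same: the data remains encrypted throughout the computation.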

Jim Halpert, a data regulation lawyer who helped draft the CCPA and is Global Co-Chair of the Data Protection, Privacy and Security practice at DLA Piper, views certain solutions based on HE as effective compliance tools.

“Homomorphic Encryption encrypts data elements in such a way that they cannot identify, describe or in any way relate to a person or household. As a result, homomorphically encrypted data cannot be considered ‘personal information’ and is thus exempt from CCPA requirements,” Halpert says. “Companies which encrypt data through HE minimize the risk of legal threats, avoid CCPA obligations, and eliminate the possibility that a third party could mistakenly be exposed to personal data.”

The same principle applies to GDPR, which requires any personally identifiable information to be protected.

HE is applicable to any industry and activity which requires sensitive data to be analyzed by third parties; for example, research such as genomic investigations into individuals’ susceptibility to COVID-19 and other health conditions, and secure data analysis in the financial services industry, including financial crime investigations across borders and institutions. In these cases, HE enables users to legally collaborate across different jurisdictions and regulatory frameworks, maximizing data value while minimizing privacy and compliance risk.

PETs will be crucial in allowing data to flow securely even after the Privacy Shield has been lowered. The EU and the US have already entered negotiations aimed at replacing the Privacy Shield, but while a palliative solution might satisfy business interests in the short term, it won’t remedy the underlying problems inherent to competing privacy frameworks. Any replacement would face immediate legal challenges in a potential “Schrems III” case. Tech is in large part responsible for the growing data privacy quandary. The onus, then, is on tech itself to help facilitate the free flow of data without undermining data protection.

Improving privacy of a global genomic data sharing network

A Case Western Reserve University computer and data sciences researcher is working to shore up privacy protections for people whose genomic information is stored in a vast global collection of vital, personal data.

Erman Ayday pursued novel methods for identifying and analyzing privacy vulnerabilities in the genomic data sharing network known commonly as “the Beacons.”

Personal genomic data refers to each person’s unique genome, his or her genetic makeup: information that can be gleaned from DNA analysis of a blood or saliva sample.

Ayday plans to identify weaknesses in the beacons’ infrastructure, developing more complex algorithms to protect against people or organizations who, like him, can figure out a person’s identity or sensitive genomic information using publicly available information.

Doing so will also protect the public: people who voluntarily shared their genomic information with hospitals where they were treated, on the understanding that their identity or sensitive information would not be revealed.

“While the shared use of genomics data is valuable to research, it is also potentially dangerous to the individual if their identity is revealed,” Ayday said. “Someone else knowing your genome is power – power over you. And, generally, people aren’t really aware of this, but we’re starting to see how genomic data can be shared, abused.”

Other research has shown that “if someone had access to your genome sequence–either directly from your saliva or other tissues, or from a popular genomic information service – they could check to see if you appear in a database of people with certain medical conditions, such as heart disease, lung cancer, or autism.”

Human genomic research

Genomics may sometimes be confused or conflated with genetics, but the terms refer to related but different fields of study:

  • Genetics refers to the study of genes and the way that certain traits or conditions are passed down from one generation to another. It involves scientific studies of genes and their effects. Genes (units of heredity) carry the instructions for making proteins, which direct the activities of cells and functions of the body.
  • Genomics is a more recent term that describes the study of all of a person’s genes (the genome), including interactions of those genes with each other and with the person’s environment. Genomics includes the scientific study of complex diseases such as heart disease, asthma, diabetes, and cancer because these diseases are typically caused more by a combination of genetic and environmental factors than by individual genes.

There has been an ever-growing cache of genomic information since the conclusion of the Human Genome Project in 2003, the 13-year-long endeavor to “discover all the estimated 20,000-25,000 human genes and make them accessible for further biological study” as well as complete the DNA sequencing of 3 billion DNA subunits for research.

Popular genealogy sites such as Ancestry.com and 23andMe rely on this information–compared against their own accumulation of genetic information and analyzed by proprietary algorithms–to discern a person’s ancestry, for example.

Ayday said companies, government organizations and others are also tapping into genomic data. “The military can check the genome of recruits, insurance companies can check whether someone has a predisposition to a certain disease,” he said. “There are plenty of real life examples already.”

Scientists researching genomics are accessing shared DNA data considered critical to advance biomedical research. To access the shared data, researchers send digital requests (“queries”) to certain beacons, each specializing in different genetic mutations.

What are ‘the Beacons?’

The Beacon Network is an array of about 100 data repositories of human genome coding, coordinated by the Global Alliance for Genomics & Health (in collaboration with a Europe-based system called ELIXIR).

And while “queries do not return information about single individuals,” according to the site, a 2015 scientific study revealed that someone could infer whether a specific individual is in a particular beacon by sending that site an excessive number of queries.

“And then we used a more sophisticated algorithm and showed that you don’t need thousands of queries,” Ayday said. “We did it by sending less than 10.”

Then, in a follow-up study, Ayday and team showed that someone could also reconstruct the entire genome for an individual with the information from only a handful of queries.

“That’s a big problem,” Ayday said. “Information about one, single individual should not be that easily found out.”
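
A toy Python simulation shows why so few queries can suffice; every number below (beacon size, variants per genome, query count) is invented for illustration rather than taken from the studies:

    import random

    random.seed(7)
    N_POSITIONS, VARIANTS_PER_GENOME, MEMBERS = 1_000_000, 100, 100

    def random_genome() -> set:
        return set(random.sample(range(N_POSITIONS), VARIANTS_PER_GENOME))

    target = random_genome()  # the person we suspect is in the beacon
    beacon = [random_genome() for _ in range(MEMBERS - 1)] + [target]
    beacon_variants = set().union(*beacon)

    def beacon_says_present(position: int) -> bool:
        """The beacon's yes/no interface: does anyone here carry this variant?"""
        return position in beacon_variants

    for label, genome in [("member", target), ("outsider", random_genome())]:
        hits = sum(beacon_says_present(p) for p in list(genome)[:10])
        print(f"{label}: {hits}/10 rare variants found")
    # member: 10/10; an outsider typically scores 0/10 or 1/10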

Safari 14: New privacy and security features

Apple has released Safari 14, which features many functional improvements, a Privacy Report that shows all the trackers the browser has neutralized, and drops support for Adobe Flash.

New features

Safari 14 sports a redesign of the tab bar, which now displays site favicons by default and previews of the contents of some pages (when the user hovers over a tab), and a customizable start page.

It also features improved extension support, as Apple has already put things in motion to allow app developers to easily convert their existing extension into a Safari web extension or build a new one.

But on to the Safari 14 privacy and security additions:

Privacy Report

The Privacy Report shows the cross-site trackers that Intelligent Tracking Prevention (ITP) prevented from accessing identifying information, and how many and which trackers the visited websites sport. It also shows which entity is behind each tracker.

ITP uses on-device machine learning to identify and block the trackers, and known trackers are independently verified by DuckDuckGo. Safari blocks trackers only if the “Prevent cross-site tracking” option is turned on, and the Privacy Report can only be compiled if ITP is enabled.

The report is accessible through the “Safari” tab, via the start page, and via the shield-style icon to the left of the browser’s address bar.

Secure password monitoring

Safari 14 will notify users when one of their saved passwords in iCloud Keychain has shown up in a data breach (iCloud Keychain has to be enabled, of course).

It will also allow them to immediately change the password by pointing them to the correct page for each website (if the admin has specified the page’s URL in the web server’s .well-known directory).

Removed support for Adobe Flash for improved security

Adobe Flash has been a thorn in security-minded users’ and cybersecurity professionals’ side for many years, as its vulnerabilities were often exploited by attackers.

Three years ago, browser makers announced that they would drop Flash support by the end of 2020, and now the time has come: Adobe Flash will reach end-of-life on December 31, 2020.

Fixed vulnerabilities

Apple has fixed four WebKit vulnerabilities in Safari 14. All can be triggered by the browser processing maliciously crafted web content and three could lead to arbitrary code execution.

More information about and a PoC for the one discovered by Marcin “Icewall” Noga of Cisco Talos can be found here.

Mobile messengers expose billions of users to privacy attacks

Popular mobile messengers expose personal data via discovery services that allow users to find contacts based on phone numbers from their address book, according to researchers.

When installing a mobile messenger like WhatsApp, new users can instantly start texting existing contacts based on the phone numbers stored on their device. For this to happen, users must grant the app permission to access and regularly upload their address book to company servers in a process called mobile contact discovery.

A recent study by a team of researchers from the Secure Software Systems Group at the University of Würzburg and the Cryptography and Privacy Engineering Group at TU Darmstadt shows that currently deployed contact discovery services severely threaten the privacy of billions of users.

Utilizing very few resources, the researchers were able to perform practical crawling attacks on the popular messengers WhatsApp, Signal, and Telegram. The results of the experiments demonstrate that malicious users or hackers can collect sensitive data at a large scale and without noteworthy restrictions by querying contact discovery services for random phone numbers.

Attackers are enabled to build accurate behavior models

For the extensive study, the researchers queried 10% of all US mobile phone numbers for WhatsApp and 100% for Signal. Thereby, they were able to gather personal (meta) data commonly stored in the messengers’ user profiles, including profile pictures, nicknames, status texts and the “last online” time.

The analyzed data also reveals interesting statistics about user behavior. For example, very few users change the default privacy settings, which for most messengers are not privacy-friendly at all.

The researchers found that about 50% of WhatsApp users in the US have a public profile picture and 90% a public “About” text. Interestingly, 40% of Signal users, who can be assumed to be more privacy-conscious in general, also use WhatsApp, and half of those Signal users have a public profile picture on WhatsApp.

Tracking such data over time enables attackers to build accurate behavior models. When the data is matched across social networks and public data sources, third parties can also build detailed profiles, for example to scam users.

For Telegram, the researchers found that its contact discovery service exposes sensitive information even about owners of phone numbers who are not registered with the service.

Which information is revealed during contact discovery and can be collected via crawling attacks depends on the service provider and the privacy settings of the user. WhatsApp and Telegram, for example, transmit the user’s entire address book to their servers.

More privacy-concerned messengers like Signal transfer only short cryptographic hash values of phone numbers or rely on trusted hardware. However, the research team shows that with new and optimized attack strategies, the low entropy of phone numbers enables attackers to deduce corresponding phone numbers from cryptographic hashes within milliseconds.
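
A back-of-the-envelope Python sketch illustrates the point: a phone number within a known US area code has only 10^7 possibilities, so a plain hash can simply be enumerated (the number below is fictional; optimized attacks, as the researchers note, run in milliseconds):

    import hashlib

    def phone_hash(number: str) -> str:
        return hashlib.sha256(number.encode()).hexdigest()

    leaked = phone_hash("+14155550123")  # the hash a server might store

    # Enumerate every line number in one area code until the hash matches.
    for line in range(10_000_000):
        candidate = f"+1415{line:07d}"
        if phone_hash(candidate) == leaked:
            print("recovered:", candidate)
            break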

Moreover, since there are no noteworthy restrictions for signing up with messaging services, any third party can create a large number of accounts to crawl the user database of a messenger for information by requesting data for random phone numbers.

“We strongly advise all users of messenger apps to revisit their privacy settings. This is currently the most effective protection against our investigated crawling attacks,” agree Prof. Alexandra Dmitrienko (University of Würzburg) and Prof. Thomas Schneider (TU Darmstadt).

Impact of research results: Service providers improve their security measures

The research team reported their findings to the respective service providers. As a result, WhatsApp has improved their protection mechanisms such that large-scale attacks can be detected, and Signal has reduced the number of possible queries to complicate crawling.

The researchers also proposed many other mitigation techniques, including a new contact discovery method that could be adopted to further reduce the efficiency of attacks without negatively impacting usability.

Do Californians use CCPA to protect their privacy?

Californians regularly opt out of companies selling their personal information: “do-not-sell” is the most common CCPA right exercised, accounting for nearly 50% of all requests, ahead of access and deletion requests, DataGrail’s Mid-Year CCPA Trends Report shows.

Consumer rights under CCPA

The California Consumer Privacy Act gives California residents the right to:

  • Know what personal data businesses have about them
  • Know what businesses do with that information (to whom they sell it or disclose it)
  • Access their personal data
  • Refuse the sale of their personal data
  • Request that a business delete their personal data

Do-not-sell requests are almost 50% of all DSRs

When CCPA went into effect in January 2020, DataGrail saw people exercise their rights immediately, with a surge of data subject requests (DSRs) going across its platform in January 2020.

Since the initial surge, DSRs have stabilized around 13 DSRs per million records every month, which is a substantial rate and confirms that organizations need an established privacy program.

Consumers are accessing their data (21%), deleting their data (31%) and requiring that businesses do-not-sell their personal information (48%).

Processing DSRs

Gartner data shows that manually processing a single DSR costs on average $1,406. At this rate, organizations can expect to spend almost $240,000 per million records to fulfill DSRs – if they are done manually.

Additionally, organizations could face even more DSRs once enforcement fines begin appearing, likely in October if CCPA follows the same timeline as GDPR.

According to the research, B2C companies should prepare to process approximately 170 total DSRs per one million consumer records each year.
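
Those two figures line up with the cost estimate above, as a quick calculation shows:

    cost_per_dsr = 1406          # USD, Gartner's estimate for manual processing
    dsrs_per_million = 170       # expected annual DSRs per million records
    print(cost_per_dsr * dsrs_per_million)   # 239020, i.e., roughly $240,000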

DataGrail has also found that three of every ten DSRs will go unverified, confirming the need for a robust and scalable verification method to prevent fraud (i.e., detect fraudulent requests being made to steal personal data).

Access requests (DSARs) make up 70% of the unverified requests, validating the concern that bad actors could be submitting access requests to gain access to someone else’s personal information.

Recommendations to enhance subscriber privacy in 5G

There are clear benefits of 5G SIM capabilities to protect the most prominent personal data involved in mobile communications, according to the Trusted Connectivity Alliance.

Addressing privacy risks

The IMSI, known as a Subscription Permanent Identifier (SUPI) in 5G, is the unique identifier allocated to an individual SIM by a mobile network operator (MNO). Despite representing highly personal information, the IMSI is exposed to significant security vulnerabilities, as it is sent unencrypted over the air in 2G, 3G and 4G networks.

Most notably, ‘IMSI catchers’ are readily and inexpensively available and can be used to illegally monitor a subscriber’s location, calls and messages.

“To address the significant privacy risks posed by IMSI catchers, the 5G standards introduced the possibility for MNOs to encrypt the IMSI before it is sent over-the-air,” comments Claus Dietze, Chair of Trusted Connectivity Alliance.

“But as the standards state that encryption can be performed either by the SIM or by the device, and even be deactivated, there is potential for significant variability in terms of implementation. This creates scenarios where the IMSI is not sufficiently protected and the subscriber’s personal data is potentially exposed.”

Managing IMSI encryption within the 5G SIM

Given these scenarios, the white paper recommends that MNOs consider limiting the available implementation options to rely on proven, certified solutions. Of the available options, executing IMSI encryption within the 5G SIM (which covers both the SIM and the eSIM, per the Trusted Connectivity Alliance’s “Recommended 5G SIM” definition) emerges as a comprehensive solution when examined against a range of key criteria, including ownership and control, the security of the SIM and its production process, and certification and interoperability.
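For readers curious what SIM-side identifier encryption involves in principle, below is a heavily simplified, ECIES-style sketch loosely modelled on the 5G SUCI mechanism standardized in 3GPP TS 33.501. The key-derivation parameters and the ten-digit example identifier are illustrative assumptions; an actual 5G SIM performs this inside certified hardware using the operator’s provisioned public key.

```python
# Simplified ECIES-style concealment of a subscriber identifier, loosely
# modelled on 5G SUCI (illustrative only; not the exact 3GPP construction).
# Requires the third-party 'cryptography' package.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

def derive_key(shared: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"suci-demo").derive(shared)

def conceal(identifier: bytes, network_pub: X25519PublicKey):
    """SIM side: encrypt the identifier under a fresh ephemeral key."""
    eph = X25519PrivateKey.generate()              # new key pair per attach
    key = derive_key(eph.exchange(network_pub))
    nonce = os.urandom(16)
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    eph_pub = eph.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
    return eph_pub, nonce + enc.update(identifier) + enc.finalize()

def reveal(eph_pub: bytes, blob: bytes, network_priv: X25519PrivateKey) -> bytes:
    """Home network side: recover the identifier with the long-term key."""
    key = derive_key(
        network_priv.exchange(X25519PublicKey.from_public_bytes(eph_pub)))
    dec = Cipher(algorithms.AES(key), modes.CTR(blob[:16])).decryptor()
    return dec.update(blob[16:]) + dec.finalize()

network_priv = X25519PrivateKey.generate()         # held only by the operator
eph_pub, blob = conceal(b"0123456789", network_priv.public_key())
assert reveal(eph_pub, blob, network_priv) == b"0123456789"
```

Because a fresh ephemeral key is generated for every attach, the identifier sent over the air never repeats, which is what defeats passive catchers.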

“Eurosmart fully supports the Trusted Connectivity Alliance position on subscriber privacy encryption, and agrees it should be managed within the 5G SIM. If we consider the direct impact on the security and resilience of critical infrastructures and essential services, and the requirements of the NIS directives, it is also apparent that a robust regulatory response is warranted to support these recommendations,” adds Philippe Proust, President of Eurosmart.

“We therefore contend that regulatory measures should be implemented to define an ad hoc security certification scheme addressing IMSI encryption within the 5G SIM under the EU Cybersecurity Act. In addition, it should be a requirement for the IMSI to be encrypted within the 5G SIM, and for the 5G SIM to be mandatorily security certified to demonstrate its capabilities.”

Claus concludes: “Managing IMSI encryption within the 5G SIM delivers control, best-in-class security and interoperability to prevent malicious and unlawful interception. And with 5G creating a vast array of new use-cases, SIM-based encryption is the only viable way to establish interoperability across emerging consumer and industrial IoT use-cases and, ultimately, enable a secure connected future.”

Data protection critical to keeping customers coming back for more

Although consumers remain concerned about sharing personal data with companies, the results of a Privitar survey highlight an opportunity for businesses to take a leadership role and build brand loyalty by protecting their customers.

The report found that more than three-quarters of respondents are concerned or very concerned about protecting their personal data, with 42 percent of consumers saying they wouldn’t share sensitive data (e.g. name, address, email address, phone number, location information, health information, banking information, social security number) with a business for any reason.

As consumers grow increasingly apprehensive about their data, business success will depend on an organization’s ability to prioritize and successfully execute on privacy initiatives.

Disconnect between consumer sentiment and actions surrounding data protection

When it comes to the management of their data, many consumers aren’t fully aware of how brands are securing their personal information. According to the survey, 43 percent of consumers don’t know if they’ve worked with a business that has been impacted by a data breach.

When it comes to privacy notices, 28 percent admitted to not reading them at all and 42 percent admitted to only skimming the text. These findings point to a growing sentiment that data privacy should be the responsibility of the business, not the customer. With this, businesses have a tremendous opportunity to make data privacy a differentiator and a way to build long-term loyalty.

Pandemic creating more data sharing opportunities, yet consumers remain wary

Despite advancements on the data protection front, 51 percent of consumers surveyed said they are still not comfortable sharing their personal information. One-third of respondents said they are most concerned about it being stolen in a breach, with another 26 percent worried about it being shared with a third party.

In the midst of the growing pandemic, COVID-19 tracking, tracing, containment and research all depend on citizens opting in to share their personal data. However, the research shows that consumers are not interested in sharing their information.

When specifically asked about sharing healthcare data, only 27 percent would share health data for healthcare advancements and research. Another 21 percent of consumers surveyed would share health data for contact tracing purposes.

As data becomes more valuable to combat the pandemic, companies must provide consumers with more background and reasoning as to why they’re collecting data – and how they plan to protect it.

Upcoming U.S. elections driving consumer awareness of data privacy

Recently, there’s been increased conversation about data privacy and protection legislation across the United States, especially as the CCPA is enacted and the CDPSA awaits its fate in Congress.

As the debate grows louder across the nation, 73 percent of consumers think that there should be more government oversight at the federal and/or state/local levels. While legislation can take years to pass, it’s important for businesses to overhaul their technology and processes now to quickly address consumers’ concerns and keep business running.

Businesses must drive data privacy action

Companies rely on brand loyalty to keep their operations up and running. While they often point to affordable costs and personalization as the means of keeping business moving, many overlook the importance of instilling a more personal sense of trust in their customer base.

When working with a business, 40 percent of consumers think the brand’s trustworthiness is most important when it comes to brand loyalty and 31 percent say it’s the brand’s commitment to protecting their data.

Nearly matched by the 30 percent of consumers who believe customer service matters most, these results show that data protection is just as critical to keeping customers coming back for more.

However, broken trust and failures to protect that data have severe consequences, with 24 percent saying they have either stopped doing business or done less business with a company after it was breached.

As markets grow increasingly competitive in a fluctuating economy, it’s critical for businesses to keep customer loyalty high – and as such, be more open and transparent with how they’re using personal data.

“The global COVID-19 pandemic has underscored the importance of the trust relationship companies and governments need to build with consumers in an increasingly digital world,” said Jason du Preez, CEO, Privitar.

“The results of the survey affirm the growing need for brands to focus on building and maintaining this trust, starting first and foremost with protecting customer data. As more businesses utilize the cloud to enable data driven insights, a firm commitment to data privacy will help to ensure long-term loyalty, consumer satisfaction and shareholder value.”

Facing gender bias in facial recognition technology

In the 1960s, Woodrow W. Bledsoe created a secret program that manually identified points on a person’s face and compared the distances between these coordinates with other images.

Facial recognition technology has come a long way since then. The field has evolved quickly and software can now automatically process staggering amounts of facial data in real time, dramatically improving the results (and reliability) of matching across a variety of use cases.

Despite all of the advancements we’ve seen, many organizations still rely on the same algorithm used by Bledsoe’s database – known as “k-nearest neighbors” or k-NN. Since each face has multiple coordinates, a comparison of these distances over millions of facial images requires significant data processing. The k-NN algorithm simplifies this process and makes matching these points easier by considerably reducing the data set. But that’s only part of the equation. Facial recognition also involves finding the location of a feature on a face before evaluating it. This requires a different algorithm such as HOG (we’ll get to it later).
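As a rough illustration of the k-NN idea, the sketch below matches a query face against a small labelled gallery of embedding vectors. The 128-dimensional embeddings and names are invented for illustration; real systems obtain such vectors from a trained feature extractor.

```python
# Minimal k-NN face matching over embedding vectors (toy data).
import numpy as np

def knn_match(query: np.ndarray, gallery: np.ndarray,
              labels: list, k: int = 3) -> str:
    """Return the majority label among the k nearest gallery faces."""
    dists = np.linalg.norm(gallery - query, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                  # indices of k closest
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

rng = np.random.default_rng(0)
alice, bob = rng.normal(size=128), rng.normal(size=128)   # "true" faces
# Three enrolled variants per person (small noise mimics pose/lighting).
gallery = np.vstack([alice + rng.normal(scale=0.1, size=128) for _ in range(3)]
                    + [bob + rng.normal(scale=0.1, size=128) for _ in range(3)])
labels = ["alice"] * 3 + ["bob"] * 3

query = alice + rng.normal(scale=0.1, size=128)      # new photo of alice
print(knn_match(query, gallery, labels))             # -> alice
```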

The problem

The algorithms used for facial recognition today rely heavily on machine learning (ML) models, which require significant training. Unfortunately, the training process can introduce biases into these technologies. If the training data doesn’t contain a representative sample of the population, the model will fail to correctly identify the underrepresented groups.

While this may not be a significant problem when matching faces for social media platforms, it can be far more damaging when the facial recognition software from Amazon, Google, Clearview AI and others is used by government agencies and law enforcement.

Previous studies on this topic found that facial recognition software suffers from racial biases, but overall, the research on bias has been thin. The consequences of such biases can be dire for both people and companies. Further complicating matters is the fact that even small changes to one’s face, hair or makeup can impact a model’s ability to accurately match faces. If not accounted for, this can create distinct challenges when trying to leverage facial recognition technology to identify women, who generally tend to use beauty and self-care products more than men.

Understanding sexism in facial recognition software

Just how bad are gender-based misidentifications? Our team at WatchGuard conducted some additional facial recognition research, looking solely at gender biases, to find out. The results were eye-opening. The solutions we evaluated misidentified women 18% more often than men.

You can imagine the terrible consequences this type of bias could generate. For example, a smartphone relying on face recognition could block access, a police officer using facial recognition software could mistakenly identify an innocent bystander as a criminal, or a government agency might call in the wrong person for questioning based on a false match. The list goes on. The reality is that bias introduced during model training creates bias in the results.

Let’s explore how we uncovered these results.

Our team performed two separate tests, the first using Amazon Rekognition and the second using Dlib. Unfortunately, with Amazon Rekognition we were unable to unpack just how its ML modeling and algorithm work due to transparency issues (although we assume it’s similar to Dlib). Dlib is a different story: it uses local resources to identify faces provided to it. It comes pretrained to locate faces, offering HOG, a slower CPU-based face location finder, and CNN, a faster algorithm that makes use of the specialized processors found in graphics cards.

Both services provide match results with additional information: besides the match found, a similarity score shows how closely the candidate face matches the known face. If the face on file doesn’t exist, a threshold set too low may produce an incorrect match; conversely, a genuine match can receive a low similarity score when the image doesn’t show the face clearly.
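The sketch below shows what such a pairwise comparison can look like using the open-source face_recognition wrapper around Dlib. The file names and the 0.6 distance cut-off are illustrative assumptions (Dlib reports a distance, so lower means more similar):

```python
# Pairwise face comparison via the 'face_recognition' wrapper around Dlib.
# File names and the 0.6 threshold are assumptions for illustration.
import face_recognition

known = face_recognition.load_image_file("known_face.jpg")
candidate = face_recognition.load_image_file("candidate_face.jpg")

# Face location step: model="hog" is the CPU-based detector; model="cnn"
# uses the graphics-card-accelerated detector mentioned above.
known_loc = face_recognition.face_locations(known, model="hog")
cand_loc = face_recognition.face_locations(candidate, model="hog")

# 128-dimensional encodings of the first face found in each image.
known_enc = face_recognition.face_encodings(known, known_loc)[0]
cand_enc = face_recognition.face_encodings(candidate, cand_loc)[0]

# Dlib reports a distance (lower = more similar); 0.6 is a common cut-off.
dist = face_recognition.face_distance([known_enc], cand_enc)[0]
print(f"distance={dist:.3f}", "match" if dist <= 0.6 else "no match")
```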

For the data set, we used a database of faces called Labeled Faces in the Wild, and we only investigated faces that matched another face in the database. This allowed us to test matching faces and similarity scores at the same time.

Amazon Rekognition correctly identified all the pictures we provided. However, when we looked more closely at the data, our team saw a wider distribution of similarity scores for female faces than for male faces: more female faces with higher similarities than men, and more with lower similarities (this actually matches a recent study performed around the same time).

What does this mean? Essentially, a female face not found in the database is more likely to produce a false match. Also, because of the lower similarity in female faces, our team was confident that we’d see more errors identifying female faces than male faces, given enough images.

Amazon Rekognition gave accurate results but lacked consistency and precision across male and female faces. Male faces were on average 99.06% similar, female faces 98.43%. This might not seem like a big variance, but the gap widened when we looked at the outliers: a standard deviation of 1.64 for males versus 2.83 for females. More female faces fall farther from the average than male faces, meaning a female false match is far more likely than the 0.6-point difference in averages would suggest.
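To see why the spread matters more than the small gap in means, one can model both score distributions as normal and ask how much of each falls below a fixed match threshold. The means and standard deviations are the figures above; the normal model and the 95% threshold are simplifying assumptions.

```python
# Model both similarity-score distributions as normal and compare how
# much of each falls below a fixed match threshold. Means and standard
# deviations are the figures above; the threshold is an assumption.
from statistics import NormalDist

male = NormalDist(mu=99.06, sigma=1.64)
female = NormalDist(mu=98.43, sigma=2.83)
threshold = 95.0

print(f"male scores below threshold:   {male.cdf(threshold):.1%}")    # ~0.7%
print(f"female scores below threshold: {female.cdf(threshold):.1%}")  # ~11.3%
```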

Dlib didn’t perform as well. On average, Dlib misidentified female faces more often than male faces, at a rate of 5% more misidentified females. When comparing faces using the slower HOG, the difference grew to 18%. Interestingly, our team found that, on average, female faces have higher similarity scores than male faces when using Dlib, but, as with Amazon Rekognition, also a larger spread of similarity scores, which explains the lower accuracy we found.

Tackling facial recognition bias

Unfortunately, facial recognition software providers struggle to be transparent when it comes to the efficacy of their solutions. For example, our team didn’t find any place in Amazon’s documentation in which users could review the processing results before the software made a positive or negative match.

Unfortunately, this assumption of accuracy (and lack of context from providers) will likely lead to more and more instances of unwarranted arrests, like this one. It’s highly unlikely that facial recognition models will reach 100% accuracy anytime soon, but industry participants must focus on improving their effectiveness nonetheless. Knowing that these programs contain biases today, law enforcement and other organizations should use them as one of many tools – not as a definitive resource.

But there is hope. If the industry can honestly acknowledge and address the biases in facial recognition software, we can work together to improve model training and outcomes, which can help reduce misidentifications not only based on gender, but race and other variables, too.

Confirmed: Browsing histories can be used to track users

Browsing histories can be used to compile unique browsing profiles, which can be used to track users, Mozilla researchers have confirmed.

There are also many third parties with enough reach to gather web histories sufficient to use browsing history as an identifier.

The research

This is not the first time that researchers have demonstrated that browsing profiles are distinctive and stable enough to be used as identifiers.

Sarah Bird, Ilana Segall and Martin Lopatka set out to reproduce the results of a 2012 paper by Lukasz Olejnik, Claude Castelluccia and Artur Janc using more refined data, and extended that work to detail the privacy risk posed by the aggregation of browsing histories.

The Mozillians collected browsing data from ~52,000 Firefox users for seven calendar days, paused for seven days, then collected for another seven. After analyzing the data, they identified 48,919 distinct browsing profiles, of which 99% were unique. (The original paper observed a set of ~400,000 web history profiles, of which 94% were unique.)

“High uniqueness holds even when histories are truncated to just 100 top sites. We then find that for users who visited 50 or more distinct domains in the two-week data collection period, ~50% can be reidentified using the top 10k sites. Reidentifiability rose to over 80% for users that browsed 150 or more distinct domains,” they noted.
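A minimal sketch of how such uniqueness can be measured: represent each profile as the set of distinct domains visited and count how many of those sets occur exactly once. The profiles below are invented for illustration.

```python
# Measure profile uniqueness: a profile is the set of distinct domains a
# user visited; count the profiles that occur exactly once. Toy data.
from collections import Counter

profiles = [
    frozenset({"news.example", "mail.example", "shop.example"}),
    frozenset({"news.example", "videos.example"}),
    frozenset({"news.example", "mail.example", "shop.example"}),  # repeat
    frozenset({"forum.example", "mail.example"}),
]

counts = Counter(profiles)
unique = sum(1 for p in profiles if counts[p] == 1)
print(f"{unique}/{len(profiles)} profiles are unique ({unique/len(profiles):.0%})")
```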

They also confirmed that browsing history profiles are stable over time, a second prerequisite for these profiles being repeatedly tied to specific users and used for online tracking.

“Our reidentifiability rates in a pool of 1,766 were below 10% for 100 sites despite a >90% profile uniqueness across datasets, but increased to ~80% when we consider 10,000 sites,” they added.

Finally, some corporate entities, such as Alphabet (Google) and Facebook, are able to observe the web to an even greater extent than when the research for the 2012 paper was conducted, which may allow them to gain deep visibility into browsing activity and use it for effective online tracking, even if users browse the internet on different devices.

Other recent research has shown that anonymizing browsing patterns/profiles through generalization does not sufficiently protect users’ anonymity.

Regulation is needed

Privacy researcher Lukasz Olejnik, one of the authors of the 2012 paper, noted that the findings of this newest research are a welcome confirmation that web browsing histories are personal data that can reveal insight about the user or be used to track users.

“In some ways, browsing histories resemble biometric-like data due to their uniqueness and stability,” he commented, and pointed out that, since this data allows individuals to be singled out of many, it automatically comes under the General Data Protection Regulation (GDPR).

“Web browsing histories are private data, and in certain contexts, they are personal data. Now the state of the art in research indicates this. Technology should follow. So too should the regulations and standards in the data processing. As well as enforcement,” he concluded.

The state of GDPR compliance in the mobile app space

Among the rights bestowed upon EU citizens by the General Data Protection Regulation (GDPR) is the right to access their personal data stored by companies (i.e., data controllers) and information about how this personal data is being processed. A group of academics from three German universities decided to investigate whether and how mobile app vendors respond to subject access requests, and the results of their four-year undercover field study are dispiriting.

The results of the study

“In three iterations between 2015 and 2019, we sent subject access requests to vendors of 225 mobile apps popular in Germany. Throughout the iterations, 19 to 26 % of the vendors were unreachable or did not reply at all. Our subject access requests were fulfilled in 15 to 53 % of the cases, with an unexpected decline between the GDPR enforcement date and the end of our study,” they shared.

“The remaining responses exhibit a long list of shortcomings, including severe violations of information security and data protection principles. Some responses even contained deceptive and misleading statements (7 to 13 %). Further, 9 % of the apps were discontinued and 27 % of the user accounts vanished during our study, mostly without proper notification about the consequences for our personal data.”

The researchers – Jacob Leon Kröger from TU Berlin (Weizenbaum Institute), Jens Lindemann from the University of Hamburg, and Prof. Dr. Dominik Herrmann from the University of Bamberg – made sure to test a representative sample of iOS and Android apps: popular and less popular, from a variety of app categories, and from vendors based in Germany, the EU, and outside of the EU.

They disguised themselves as an ordinary German user, created accounts needed for the apps to work, interacted with each app for about ten minutes, and asked app providers for information about their stored personal data (before and after GDPR enforcement).

They also used a different request text for each round of inquiries. The first was more informal, while the last two were more elaborate and included references to relevant data protection laws and a warning that the responsible data protection authorities would be notified in the case of no response.

“While we cannot precisely determine their individual influence, it can be assumed that both the introduction of the GDPR as well as the more formal and threatening tone of our inquiry in [the latter two inquiries] had an impact on the vendors’ behavior,” they noted.

Solving the problem

Smartphones are ubiquitous and most users use a variety of mobile apps, which usually collect personal user data and share it with third parties.

In theory, the GDPR should force mobile app vendors to provide users with information about this data and how it’s used. In practice, though, many app vendors are evidently hoping that users won’t care enough to make a stink when they don’t receive a satisfactory reply, and that GDPR regulators won’t have the resources to enforce the regulation.

“We (…) suspected that some vendors merely pretended to be poorly reachable when they received subject access requests – while others actually had insufficient resources to process incoming emails,” the researchers noted.

“To confirm this hypothesis, we tested how the vendors that failed to respond to our requests reacted to non-privacy related inquiries. Using another (different) fake identity, we emailed the vendors who had not replied [to the first inquiry] and [to the third inquiry], expressing interest in promoting their apps on a personal blog or YouTube channel. Out of the group of initial non-responders, 31 % [first inquiry] and 22 % [third inquiry] replied to these dummy requests, many of them within a few hours, proving that their email inbox was in fact being monitored.”

The researchers believe the situation for users can be improved by authorities doing random compliance checks and offering better support for data controllers through industry-specific guidelines and best practices.

“In particular, there should be mandatory standard interfaces for providing data exports and other privacy-related information to data subjects, obviating the need for the manual processing of GDPR requests,” they concluded.
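To make that recommendation concrete, here is a hypothetical sketch of what a standard, machine-readable subject-access export could look like. The schema tag and field names are invented for illustration; no such standard currently exists.

```python
# Hypothetical machine-readable subject-access export (invented schema).
import json
from datetime import date

def build_subject_access_export(user_id: str) -> str:
    """Assemble a self-describing JSON export for one data subject."""
    export = {
        "schema": "dsar-export/0.1",              # hypothetical schema tag
        "generated": date.today().isoformat(),
        "subject": user_id,
        "categories": {                           # data held, by category
            "account": {"email": "user@example.com", "created": "2018-03-01"},
            "usage": {"last_login": "2020-06-30", "sessions_90d": 42},
        },
        "recipients": ["analytics-provider.example"],   # third-party sharing
        "retention": {"account": "until deletion request"},
    }
    return json.dumps(export, indent=2)

print(build_subject_access_export("user-123"))
```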