Misconfigured or unsecured databases exposed on the open web are a fact of life. We hear about some of them because security researchers tell us how they discovered them, pinpointed their owners and alerted them, but many others are found by attackers first.
It used to take months to scan the Internet looking for open systems, but attackers now have access to free and easy-to-use scanning tools that can find them in less than an hour.
“There’s no way to leave unsecured data online without opening the data up to attack. This is why it’s crucial to always enable security and authentication features when setting up databases, so that your organization avoids this risk altogether.”
What do attackers do with exposed databases?
Bressers has been involved in the security of products and projects – especially open-source – for a very long time. In the past two decades, he created the product security division at Progeny Linux Systems, managed the Red Hat product security team, and headed the security strategy in Red Hat’s Platform Business Unit.
He now manages bug bounties, penetration testing and security vulnerability programs for Elastic’s products, as well as the company’s efforts to improve application security and to add or improve security features as needed or requested by customers.
The problem with exposed Elasticsearch, MariaDB, MongoDB and similar databases, he says, is that developers often leave them unsecured by mistake and companies don’t discover the exposure quickly.
“The scanning tools do most of the work, so it’s up to the attacker to decide if the database has any data worth stealing,” he noted, and pointed out that this isn’t hacking, exactly – it’s mining of open services.
Attackers can quickly exfiltrate the accessible data, hold it for ransom, sell it to the highest bidder, modify it or simply delete it all.
“Sometimes there’s no clear advantage or motive. For example, this summer saw a string of cyberattacks called the Meow Bot attacks that have affected at least 25,000 databases so far. The attacker replaced the contents of every afflicted database with the word ‘meow’ but has not been identified or revealed anything behind the purpose of the attack,” he explained.
Advice for organizations that use clustered databases
Open-source database platforms such as Elasticsearch have built-in security to prevent attacks of this nature, but developers often disable those features in haste or due to a lack of understanding that their actions can put customer data at risk, Bressers says.
“The most important thing to keep in mind when trying to secure data is having a clear understanding of what you are securing and what it means to your organization. How sensitive is the data? What level of security needs to be applied? Who should have access?” he explained.
“Sometimes working with a partner who is an expert at running a modern database is a more secure alternative than doing it yourself. Sometimes it’s not. Modern data management is a new problem for many organizations; make sure your people understand the opportunities and challenges. And most importantly, make sure they have the tools and training.”
He also says companies should set up external scanning systems that continuously check for exposed databases.
“These may be the same tools used by attackers, but they immediately notify security teams when a developer has mistakenly left sensitive data unlocked. For example, a free scanner is available from Shadowserver.”
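The core of such an external check can be sketched in a few lines of Python. This is a minimal illustration, not the way Shadowserver or any particular scanner works: it sends one unauthenticated request and treats an HTTP 200 as a sign that the service answers anyone. The function names and the verdict labels are assumptions made for this example.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map the HTTP status of an unauthenticated request to a verdict."""
    if status == 200:
        return "exposed"        # the service answered without credentials
    if status in (401, 403):
        return "auth-required"  # anonymous access is being rejected
    return "unknown"            # anything else needs a human to look


def check_endpoint(url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to url and classify the answer."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Run on a schedule from outside the corporate network against hosts the organization owns, for example `check_endpoint("http://db.example.internal:9200/")` (a hypothetical host on Elasticsearch's default HTTP port), an "exposed" result is the cue to alert the security team.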
Elastic offers information and documentation on how to enable the security features of Elasticsearch databases and prevent exposure, he adds, and points out that security is enabled by default in the company’s Elasticsearch Service on Elastic Cloud and cannot be disabled.
Defense in depth
No organization will ever be 100% safe, but steps can be taken to decrease a company’s attack surface. “Defense in depth” is the name of the game, Bressers says, and in this case, it should include the following security layers:
- Discovery of data exposure (using the previously mentioned external scanning systems)
- Strong authentication (SSO or usernames/passwords)
- Prioritization of data access (e.g., HR may only need access to employee information and the accounting department may only need access to budget and tax data)
- Deployment of monitoring infrastructures and automated solutions that can quickly identify potential problems before they become emergencies, isolate infected databases, and flag to support and IT teams for next steps
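The data access layer in the list above comes down to a deny-by-default policy. The sketch below shows the idea in Python. The department names and data categories are hypothetical, taken from the HR and accounting example, and a real deployment would enforce this in the database's own access controls rather than in application code.

```python
# Hypothetical mapping of departments to the data categories they need.
ACCESS_POLICY = {
    "hr": {"employee_records"},
    "accounting": {"budget", "tax"},
}


def can_access(department: str, category: str) -> bool:
    """Deny by default: allow only categories explicitly granted."""
    return category in ACCESS_POLICY.get(department, set())
```

A department that is missing from the policy gets nothing, so forgetting to configure a new team fails closed rather than open.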
He also advises organizations that don’t have the internal expertise to set security configurations and manage a clustered database to hire service providers that can handle data management and have a strong security portfolio. They should also always have a mitigation plan in place and rehearse it with their IT and security teams so that when something does happen, they can execute a swift and intentional response.
The ever-escalating popularity of Elasticsearch – the distributed open source search and log analytics engine that has become a staple in enterprise application developers’ tool belts – is well-warranted. Elasticsearch security lapses, however, have been a headline-grabbing thorn in the side of the technology.
The distributed document store too often represents a security blind spot for organizations, inexcusably failing to receive the attention and upkeep that other data storage solutions are normally granted. Data breach incidents involving Elasticsearch have been commonly rooted in this lack of attention, as well as a poor overall understanding of Elasticsearch security requirements.
As an open source solution, Elasticsearch can be downloaded without any subscription or enterprise license required. But in its default configuration, Elasticsearch doesn’t come with enterprise-grade security features. This can add up to a perfect storm from a security perspective: Elasticsearch is tremendously easy to deploy, but just as simple to forget about when it comes to hardening security that properly restricts access and protects data.
In a now all-too-common scenario, technology teams expose their development or testing systems to the internet for convenience, and then forget to change to a secure configuration before moving Elasticsearch into production. The result – careless exposure of production Elasticsearch data to anyone who might access it – puts organizations at risk.
Elasticsearch security options
Until recently, the best (and, really, the only) viable option for ensuring Elasticsearch security was using the Elastic Stack extension X-Pack. X-Pack requires purchasing a costly enterprise subscription from Elastic. For that cost, X-Pack does provide valuable enterprise-grade security.
However, there is now another option: the Amazon-initiated Open Distro for Elasticsearch project offers a slate of enterprise-grade security features with open source availability. Among these, Open Distro for Elasticsearch includes encryption of data in transit, supporting OpenSSL and TLS 1.2. This protects both traffic from external clients and internal traffic among cluster nodes, simplifies integration with public key infrastructures, and helps enterprises satisfy strict regulatory compliance requirements.
Open Distro for Elasticsearch readily integrates into authentication infrastructures as well, allowing enterprises to authenticate users through LDAP/Active Directory, Kerberos, SAML, and other popular protocols.
Open Distro for Elasticsearch also includes role-based access controls (RBACs), featuring granular controls for limiting each user’s access to only those cluster operations, indices, or documents and fields they require. Its audit logs track and record all user actions within the cluster, which supports security incident response, allows all activity to be monitored, and helps keep the Elasticsearch cluster in line with government and industry regulations.
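To make field-level access control and audit logging concrete, here is a conceptual Python sketch. It is not Open Distro's configuration format or API. The role names, field names and the in-memory `audit_log` list are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical roles and the document fields each one may read.
ROLE_FIELDS = {
    "analyst": {"order_id", "total"},
    "support": {"order_id", "customer_email"},
}

# Stand-in for a real audit trail, which would go to durable storage.
audit_log = []


def read_document(user: str, role: str, document: dict) -> dict:
    """Return only the fields the role may see and record the access."""
    allowed = ROLE_FIELDS.get(role, set())  # unknown roles see nothing
    visible = {key: value for key, value in document.items() if key in allowed}
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "fields_returned": sorted(visible),
    })
    return visible
```

Every read leaves an entry behind, including reads that return nothing, which is what lets an incident responder later reconstruct who saw which fields.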
In comparison, Elastic’s X-Pack similarly features SSL/TLS encryption, authorization and access controls including password protection, RBACs, and IP filtering, and the ability to maintain audit trails. While Elastic has also taken the step of opening its code for X-Pack, the clear caveat remains that the required licensing fees make X-Pack the costlier option for securing Elasticsearch.
5 actions enterprises should take to ensure Elasticsearch security
Whichever solution for achieving Elasticsearch security an enterprise selects, the following best practices should be top-of-mind:
1. Encrypt all data. Utilize TLS to encrypt all traffic within your Elasticsearch cluster, as well as all traffic from data sources connecting to your Elasticsearch cluster.
2. Do not expose your Elasticsearch cluster to the internet without the proper precautions. In cases where such exposure is required, ensure that internet-facing servers use secure configurations and leverage firewalls, least-privilege policies and access controls, proxies, etc.
3. Implement strict access controls. Control access to indices, documents, and more with secure authentication methods and RBACs.
4. Introduce audit logs. Utilize audit logging to track the actions of all users within your Elasticsearch cluster, monitor any suspicious activity, and conduct informed security incident responses.
5. Leverage provider support when necessary. If in need of external expertise and support, enlist a managed Elasticsearch provider capable of mitigating your security risks. Such providers can offer out-of-the-box security features such as encryption, access control, and monitoring and alerts, while ensuring the integrity of your data in accordance with regulatory standards.
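On the client side, the first and third practices above can be sketched with Python's standard library alone. This is a minimal example under stated assumptions: the helper names are invented here, HTTP Basic authentication stands in for whatever method the cluster actually uses, and the CA file is optional.

```python
import base64
import ssl
import urllib.request


def build_tls_context(ca_file=None):
    """Create a context that verifies certificates and requires TLS 1.2+."""
    context = ssl.create_default_context(cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def build_request(url: str, username: str, password: str):
    """Build an authenticated request, refusing unencrypted URLs."""
    if not url.startswith("https://"):
        raise ValueError("refusing to send credentials over plain HTTP")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
```

The two pieces would be combined as `urllib.request.urlopen(build_request(url, user, password), context=build_tls_context())`. The default context already verifies the server certificate and hostname, so the sketch only tightens the minimum protocol version.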
Safely realizing the full benefits of Elasticsearch – and there are many of them – requires paying close attention to your data security protections, the same as you would with any database implementation. By selecting a suitable security strategy and adhering to best practices, your organization can get the most out of Elasticsearch while still keeping data fully secure.
Siren, the leading Elasticsearch-based investigative intelligence platform, announced the release of Siren 10.4. The latest version of Siren introduces a major breakthrough in big data graph analysis: the capability of finding the “shortest” paths across connected records that are inside Elasticsearch clusters.
“Finding the shortest and most significant data paths is absolutely critical in sectors such as intelligence, law enforcement, and financial fraud,” said John Randles, CEO at Siren.
“Since version 10.3, Siren efficiently supported this operation for data stored in native graph databases. Now with version 10.4, we enable this for data being stored in Siren’s Elasticsearch nodes and other systems. This opens up capabilities to a massive amount of data that can be used to solve very real-world problems.”
Dr. Giovanni Tummarello, Co-Founder and Chief Product Officer at Siren, said: “With 10.4, we improve support for graph databases like Neo4J and others, while also allowing users to run certain graph algorithms directly on data living in other backends.
“For our customers and the industry in general, this radically changes the norm, and greatly increases the value of data connected to the Siren platform.”
Siren currently has three pending patents covering aspects of this implementation. The search can operate on complete or filtered data subgraphs.
When running on data residing inside Elasticsearch clusters, the algorithm leverages the distributed join capabilities of Siren’s Federate plugin, while switching to using SQL statements when traversing data on remote JDBC DataSources.
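For readers unfamiliar with the underlying operation, a shortest-path search over connected records can be illustrated with a textbook breadth-first search. The Python sketch below is not Siren's patent-pending implementation and does not use the Federate plugin or JDBC. It works on an in-memory list of record pairs, and the record identifiers in the usage note are made up.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return the shortest list of records linking start to goal, or None."""
    # Build an undirected adjacency map from pairs of connected records.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # the two records are not connected
```

Given links such as `("alice", "acct-1")` and `("acct-1", "bob")`, the call `shortest_path(edges, "alice", "bob")` returns the chain of records joining the two people. Searching a filtered subgraph amounts to passing in only the edges that survive the filter.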
Siren 10.4 also includes a host of new features, including support for internationalized UI, improved support for white-labeling, and OpenID support. For intelligence and law enforcement customers, 10.4 also adds support for secure cloud environments and the ability to open Siren-generated graphs directly in the widely adopted IBM i2 Analyst’s Notebook.
With respect to future development, Tummarello concludes: “This is the first graph algorithm in Siren that runs directly – and efficiently – on native backends that connect to our platform. Currently, in relation to performance, it delivers on customer expectations.
“But this is only the beginning. Fairly soon we expect performance and scalability to further increase, likely in excess of 10x, as we roll out a series of new optimizations and capabilities via the Federate plugin. The performance will also be complemented by several new graph algorithms which will add to this initial one.”