Data storage

Data storage is the retention of digital information on recording media, so that it can be accessed by computers or other devices for future use.

What is data storage?

Data storage refers to the retention of digital information on recording media, so that it can be accessed by computers or other devices for future use.

Data storage systems use electromagnetic, optical or other media to store data, and can be attached directly to a computer, accessed via a network, or hosted in the cloud. For speed and ease of access, the most frequently used data is normally stored closest to the user (such as on their computer's hard drive or a direct-attached storage (DAS) device), while less frequently used (and often larger) data is stored further away, such as on a network-attached storage (NAS) device or in the cloud.

Different data storage methods have different characteristics, such as:

  • Volatility – volatile memory is normally the fastest storage technology, but requires constant power to retain information, as is the case with a PC's main memory (RAM)
  • Mutability – the ability to overwrite information
  • Accessibility – how quickly different pieces of information can be reached, for example through random access versus sequential access
  • Addressability – how information can be located
  • Capacity – the total amount of stored information that a storage device or medium can hold and its density
  • Performance – including latency (the time it takes to access a particular location in storage); throughput (the rate at which information can be read from or written to the storage); granularity (the size of information that can be read in a single unit); and reliability (failure rates)
  • Energy use – how much power a method/device typically consumes, particularly to keep it cool
  • Security – how the device is protected, including encryption
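The performance terms above can be related with some simple arithmetic. The following is a minimal sketch, assuming a simplified model in which a single sequential read costs one latency delay plus the transfer time at a constant throughput. The function names and the model itself are illustrative assumptions, not a description of how any particular device behaves.

```python
import math


def units_to_read(size_bytes: int, unit_bytes: int) -> int:
    """Number of granularity-sized units needed to cover size_bytes."""
    if unit_bytes <= 0:
        raise ValueError("unit_bytes must be positive")
    return math.ceil(size_bytes / unit_bytes)


def estimated_read_seconds(
    size_bytes: int, latency_s: float, throughput_bytes_per_s: float
) -> float:
    """Rough read time: one access delay plus transfer at constant throughput.

    Assumption: a single sequential request. Real devices also vary with
    caching, queueing and access pattern, which this model ignores.
    """
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput_bytes_per_s must be positive")
    return latency_s + size_bytes / throughput_bytes_per_s
```

Under this model, latency dominates small reads and throughput dominates large ones, which is one reason frequently used data tends to be kept on the fastest, closest storage.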

What are the benefits of data storage?

Data storage provides key benefits including:

  • The ability to easily share data between different users, devices and systems, both inside and outside the organization
  • The ability to back up data securely to avoid issues or downtime if the original data is corrupted, lost or subject to a cyberattack
  • The ability to turn data into value. Accessible data storage is vital to bringing data together to gain real value from it and thus drive data democratization
  • The ability to collect and bring together enormous volumes of data, enabling companies to run comprehensive AI and machine learning projects

What are the types of data storage?

Data can be recorded and stored in three main forms:

  • File storage. In this method data is stored in files and folders, organized in a hierarchy. This is how data is stored on computer hard drives or cloud-based storage such as Google Drive and Microsoft OneDrive. The user sees the same files and folders that are recorded on the drive.
  • Block storage (or block-level storage). Data is split into evenly sized blocks, which are stored separately, each with the identifier needed to locate and retrieve it. Often used in cloud storage or storage area networks (SANs), blocks can be stored anywhere on the data infrastructure, with identifiers enabling them to be found and accessed quickly.
  • Object storage (or object-level storage). This architecture is designed to handle large amounts of unstructured data, such as email, videos, photos, or web pages – essentially anything that cannot be organized into a traditional relational database.
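To make the block storage idea more concrete, here is a minimal sketch, assuming an in-memory dictionary stands in for the storage infrastructure. The names `BLOCK_SIZE`, `store_blocks` and `load_blocks` are hypothetical, and the tiny block size is chosen only for readability.

```python
import uuid

BLOCK_SIZE = 4  # bytes; hypothetical and tiny, for illustration only


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into fixed-size blocks (the final block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store_blocks(data: bytes, block_store: dict[str, bytes]) -> list[str]:
    """Store each block under its own identifier and return the ordered ids."""
    block_ids = []
    for block in split_into_blocks(data):
        block_id = uuid.uuid4().hex  # identifier used to locate the block later
        block_store[block_id] = block
        block_ids.append(block_id)
    return block_ids


def load_blocks(block_ids: list[str], block_store: dict[str, bytes]) -> bytes:
    """Reassemble the original data by looking up each block by identifier."""
    return b"".join(block_store[block_id] for block_id in block_ids)
```

Because each block is found through its identifier rather than its position, the blocks could in principle live on different devices, which is the property the block storage description relies on.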

Where can data be stored?

Data can be stored in a variety of devices and locations, including:

  • A computer’s local hard drive, situated within the machine
  • Devices attached directly to the PC, such as a USB drive, SSD/HDD, CD/DVD drive or tape device
  • Backup appliances, storing copies of a machine’s data for disaster recovery. These can either be permanently connected to a machine, or detachable so that they can be taken off-site to provide a further level of protection in case of fire or business disaster
  • Network-based storage, which keeps data away from individual machines and allows more than one computer to access it. This can take the form of network-attached storage (NAS) devices or a storage area network (SAN), a separate network made up of multiple storage devices
  • Online, cloud-based storage services accessed directly by users, such as Google Drive or Microsoft SharePoint
  • Cloud infrastructure platforms such as Amazon Web Services, Google Cloud and Microsoft Azure

Most organizations' data architectures store data in a variety of devices and locations, both on-premises and in the cloud.

What are the challenges to effective data storage?

Simply storing data does not deliver value by itself – it needs to be easily accessible and understandable to maximize its use. Effective data storage strategies therefore need to focus on:

  • Accessibility, with the ability for all to access relevant data quickly and easily, wherever it is located
  • Understandability, with data clearly described so that anyone can use it with confidence
  • Security, protecting data against internal and external threats
  • Governance, limiting access to sensitive or personally identifiable information (PII)
  • Anonymity, ensuring that any sensitive data (especially PII) is anonymized to avoid misuse
  • Compliance, ensuring that all data storage and retention meets relevant regulations such as the GDPR, CCPA and other legislation
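As one illustration of protecting PII before data is stored or shared, the sketch below replaces selected fields with keyed hashes using Python's standard `hmac` and `hashlib` modules. Strictly speaking this is pseudonymization, a weaker guarantee than full anonymization, since anyone holding the key can recompute the tokens. The field names in `PII_FIELDS` and the function names are hypothetical.

```python
import hashlib
import hmac

PII_FIELDS = {"name", "email"}  # hypothetical field names treated as PII


def pseudonymize_value(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed SHA-256 digest (stable for a given key)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of record with PII fields replaced by keyed digests."""
    return {
        field: pseudonymize_value(str(value), secret_key)
        if field in PII_FIELDS
        else value
        for field, value in record.items()
    }
```

The same input always maps to the same token for a given key, so records can still be joined or counted without exposing the original values. Whether this is sufficient for a given regulation is a separate question that this sketch does not answer.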
