What is ACID

Databases store data in a structured, searchable format that can be accessed in various ways. They are essential to business and consumer applications alike.

The most widely used type of database is the relational database, which stores data in tables made up of rows and columns.

Databases must adhere to ACID (Atomicity, Consistency, Isolation, and Durability) principles in order to protect data integrity. These properties protect transactions and guarantee that the database can recover from any failures.

Atomicity

ACID stands for Atomicity, Consistency, Isolation, and Durability in database transactions. Together, these characteristics guarantee that data within a database remains consistent and accurate.

Atomicity is the property of a database transaction to occur as a single, indivisible series of operations: either every operation takes effect, or none do. This is typically achieved through mechanisms such as journaling or write-ahead logging, together with operating-system calls that make the changes permanent.
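
As a minimal sketch of an atomic transaction, the SQL snippet below wraps a funds transfer in a single transaction; the accounts table and its columns are hypothetical names used only for illustration.

    BEGIN TRY
        BEGIN TRANSACTION;
        -- Debit one account and credit another; both succeed or both are undone.
        UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
        UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Any error rolls back the whole transfer, leaving no partial update behind.
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
    END CATCH;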

Databases often rely on locking to guarantee each transaction is atomic. A lock identifies the data a transaction needs and permits only one transaction to modify it at a time, until that transaction completes or fails. Two-phase locking, in which a transaction acquires all of its locks before releasing any of them, can provide full isolation by guaranteeing that no other changes are made before yours has been finalized.
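
In SQL Server syntax, for example, a transaction can reserve the rows it intends to change with an update lock before modifying them; this is only a sketch, and the accounts table is again hypothetical.

    BEGIN TRANSACTION;

    -- Hold an update lock on the row until the transaction ends, so no other
    -- transaction can modify it in the meantime (UPDLOCK + HOLDLOCK hints).
    SELECT balance
    FROM accounts WITH (UPDLOCK, HOLDLOCK)
    WHERE account_id = 1;

    UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;

    COMMIT TRANSACTION;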

If a transaction isn’t atomic, a failure partway through leaves partial changes behind. The database or the application must then detect and undo those changes and start the work over, which reduces efficiency and leads to worse performance.

Another potential issue is media failure, when a disk or other storage device becomes inaccessible due to an error. This type of malfunction can lead to corrupted data or even the complete loss of a database.

Atomicity can help mitigate the consequences of such failures by keeping data consistent and uncorrupted. It also keeps data recoverable if the system crashes.

Atomicity is the guarantee that each transaction occurs as a discrete, complete unit. Without it, transactions could take longer to complete, and data in the database could become inaccurate or corrupted.

Atomicity is the goal of every database transaction and can be achieved through techniques such as locking. A database that ensures atomicity will be able to withstand power outages or other failures within its system or operating system, whether due to a bug in the DBMS code or hardware malfunction.

Consistency

Data consistency refers to the ability of a database to ensure that all data points within it conform to an established set of rules and valid values. Data failing to meet these standards can have serious repercussions for the entire system.

Consistency rules are defined by the database designer and must be satisfied whenever data is written into the system. They may include constraints, cascades, and triggers. If a transaction alters data in a way that violates these rules, it is rolled back and the operation is aborted.

Another way to guarantee consistency is through declarative constraints, which prevent any data changes from bringing data into an illegal state. For instance, if all customer accounts must have positive balances, any changes that would bring them into negative balance will be rolled back and the process aborted.
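
As a sketch of such a declarative constraint, a CHECK clause on a hypothetical accounts table rejects any change that would produce a negative balance.

    -- Hypothetical table: the CHECK constraint refuses any insert or update
    -- that would leave an account with a negative balance.
    CREATE TABLE accounts (
        account_id INT PRIMARY KEY,
        balance    DECIMAL(12, 2) NOT NULL,
        CONSTRAINT chk_positive_balance CHECK (balance >= 0)
    );

    -- This update would violate the constraint, so the statement fails and the
    -- enclosing transaction can be rolled back.
    UPDATE accounts SET balance = balance - 1000 WHERE account_id = 1;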

Consistency is essential for guaranteeing data in a database is accurate and retrievable by clients, especially when processing large volumes of information.

Consider a bank customer who transfers funds between two accounts. While this may seem like a small operation, it can have significant effects on the correctness of an application or service.

If the transfer is not applied consistently, for example if the debit succeeds but the matching credit does not, the customer could end up with an incorrect balance. This causes major confusion and typically takes considerable effort to rectify.

In such cases, weak consistency can lead to errors that cause downtime and frustration for users as well as a poor brand image.

Consistency is becoming an increasingly critical requirement for database systems. It is a property of modern systems that developers cannot afford to neglect.

Weak consistency can be a real issue when a distributed database has hundreds of data nodes spread across multiple regions, and the trade-offs involved can significantly affect response times, scalability, and availability.

Isolation

Isolation is the property of a database system that prevents one transaction’s reads and writes from being affected by other concurrent transactions. This keeps the data each transaction sees accurate and shields it from modifications made by other transactions. Isolation plays an essential role in any database system.

Isolating transactions helps to eliminate concurrency-related side effects such as dirty reads and lost updates. Determining the appropriate isolation level depends on your application’s data integrity requirements.

A lower isolation level increases the number of concurrent users that can access data simultaneously, but also allows more potential conflicts and side effects caused by concurrency. Selecting the optimal isolation level requires careful consideration of your application’s data access code and requirements.

The SQL standard defines four levels of isolation, from read uncommitted at the lowest level to serializable at the highest. The higher the level, the stronger the guarantees it provides for data safety and integrity.

At the lowest level of isolation, read uncommitted, a transaction can read data that has been modified but not yet committed by other transactions. This level offers no protection against dirty reads or non-repeatable reads.

At the intermediate isolation level of read committed, transactions can only read data that has already been committed by other transactions. This prevents dirty reads, but other transactions can still modify existing rows or add new ones between reads.

However, this level is still a compromise: its locks prevent dirty reads, but they cannot stop another transaction from changing rows or inserting new ones while the first transaction is still working, so non-repeatable reads and phantom reads remain possible.

Serializable, the highest isolation level, makes concurrent transactions appear to execute serially and prevents all three read phenomena. Range locks, which sit between row and table locking in granularity, are used to prevent phantom reads.
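
As a brief sketch in SQL Server syntax, the isolation level is chosen for the session before the transaction starts; the accounts table here is a hypothetical example.

    -- Choose the isolation level for the current session before the transaction starts.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    BEGIN TRANSACTION;

    -- Repeated reads of this range see the same rows; other transactions cannot
    -- insert matching rows (no phantom reads) until this transaction ends.
    SELECT COUNT(*) FROM accounts WHERE balance > 1000;

    COMMIT TRANSACTION;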

Aside from avoiding concurrency side effects, a higher isolation level increases the overhead each user incurs when accessing data, because the database must hold more locks for longer. That cost can be significant; perfect isolation is difficult to achieve and usually comes at a price in throughput.

Durability

Atomicity, Consistency, Isolation, and Durability (the ACID properties) are essential features of database systems. These characteristics, also referred to as DBMS transaction properties, guarantee that transactions occur in isolation and that the data remains valid and compliant with your defined constraints.

Databases that implement the ACID properties guarantee that committed transactions remain in the system even in case of a crash or power outage. Any updates or modifications made to the database persist once they have been completed and saved to permanent storage, such as disk, so they can be reconstructed in case of failure.

ACID properties are essential in safeguarding your database against corruption or loss due to an accident, power outage, or other unanticipated events. Furthermore, they offer an extra level of security that lets you rest easy knowing your data is correct and accurate.

Durability, which ensures a committed transaction remains in the system after it is finished and is not lost when the database crashes, is one of the most essential properties. Without durability, databases that hold changes only in volatile memory such as RAM would lose that information if an unexpected event like a power outage occurred.

Databases typically use write-ahead logs, which record every modification before it is applied to the data files. After a crash, these logs can be replayed to redo committed transactions and roll back incomplete ones.

Normally the transaction log is flushed to disk before a transaction commits. Some systems, such as SQL Server, can defer this flush to reduce contention for log IO and increase throughput; this behavior is controlled with the DELAYED_DURABILITY setting on the database, and sys.sp_flush_log can be called to flush the log on demand.

When DELAYED_DURABILITY is forced at the database level, it takes precedence over commit-level settings. Committed delayed-durable transactions are saved to disk when a fully durable transaction commits against any table (durable memory-optimized or disk-based) in the database, or when sp_flush_log is executed successfully.
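
As a sketch in SQL Server syntax (the database and table names are hypothetical), delayed durability can be allowed at the database level, requested per commit, and the log flushed manually.

    -- Allow individual transactions to opt in to delayed durability.
    ALTER DATABASE MyDatabase SET DELAYED_DURABILITY = ALLOWED;

    BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance + 10 WHERE account_id = 1;
    -- Commit without waiting for the log flush; the change is durable only after
    -- the log is later hardened to disk.
    COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);

    -- Force any pending delayed-durable commits to be flushed to disk now.
    EXEC sys.sp_flush_log;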
