Understanding Vertica Epochs

Overview of Epochs

An epoch is 64-bit number that represents a logical time stamp for the data in Vertica. Every row has an implicitly stored column that records the committed epoch.

=> CREATE TABLE testdata (a INT, b INT);
=> INSERT INTO testdata VALUES (1,2);
=> INSERT INTO testdata VALUES (3,4);
=> COMMIT;
=> INSERT INTO testdata VALUES (5,6);
COMMIT;
=> SELECT a,b,epoch FROM testdata;

OUTPUT
--------
      1
(1 row)

 OUTPUT
--------
      1
(1 row)

COMMIT

 OUTPUT
--------
      1
(1 row)

COMMIT
 a | b | epoch
---+---+-------
 1 | 2 |  1898
 3 | 4 |  1898
 5 | 6 |  1899
(3 rows)

The epoch advances when the logical state of the system changes or when the data is committed with a DML operation (INSERT, UPDATE, MERGE, COPY, or DELETE). The EPOCHS system table contains the date and time of each closed epoch and the corresponding epoch number of the closed epoch within the epoch map. This information lets you determine which time periods pertain to which epochs:

=> SELECT * FROM epochs;
      epoch_close_time         | epoch_number
-------------------------------+--------------
 2015-11-09 17:47:21.013641+00 |         1350
 2015-11-09 18:03:30.454707+00 |         1351
 2015-11-09 18:04:48.329494+00 |         1371
 2015-11-09 18:18:42.796702+00 |         1372
 2015-11-09 18:19:52.738018+00 |         1392
 2015-11-09 18:26:43.655577+00 |         1393
 2015-11-11 15:21:47.918074+00 |         1394
 2015-11-11 15:23:16.80952+00  |         1897
 2015-11-20 18:13:20.324436+00 |         1898
 2015-11-20 18:13:20.372756+00 |         1899
(10 rows)		

Types of Epochs

An epoch map is the list of epochs that are between AHM and LE. An epoch map is stored in the database catalog.

Epoch Map.jpg

Different types of epochs are as follows:

Current Epoch (CE)

The current epoch is the open epoch that becomes the last epoch (LE) after a COMMIT operation. The current_epoch at the time of the COMMIT is the epoch for that DML.

=> SELECT CURRENT_EPOCH FROM SYSTEM;
CURRENT_EPOCH 
---------------
629415 

(1 row)

=> INSERT INTO test VALUES (10,20); COMMIT;
OUTPUT 
--------
1
(1 row)

COMMIT
=> SELECT current_epoch FROM SYSTEM;
 current_epoch
---------------
        629416
(1 row)

Latest Epoch (LE)

The latest epoch is the most recently closed epoch. The current epoch after the COMMIT operation becomes the latest epoch.

Checkpoint Epoch (CPE)

The checkpoint epoch per projection is the latest epoch for which there is no data in the WOS. It is the point at which the projection can be recovered. The Tuple Mover moveout operation advances the projection CPE while moving the data from WOS to the ROS. You can see the projection checkpoint epochs in the PROJECTION_CHECKPOINT_EPOCHS system table.

=> SELECT node_name, projection_schema schema, projection_name p_name, is_up_to_date UTD, checkpoint_epoch CPE, would_recover WR, is_behind_ahm BAHM from projection_checkpoint_epochs;

       node_name        |  schema | p_name                   |   UTD |    CPE    | WR    | BAHM
------------------------+---------+--------------------------+-------+-----------+-------+-----
 v_test_israel_node0001 | public  |   product_dimension_b1   |   t   |    1347   |   f   |   f
 v_test_israel_node0003 | public  |   product_dimension_b1   |   t   |    1347   |   f   |   f
 v_test_israel_node0002 | public  |   product_dimension_b1   |   t   |    1347   |   f   |   f
 v_test_israel_node0002 | store   |   store_dimension_b1     |   t   |    1347   |   f   |   f
 v_test_israel_node0003 | store   |   store_dimension_b1     |   t   |    1347   |   f   |   f
 v_test_israel_node0001 | store   |   store_dimension_b1     |   t   |    1347   |   f   |   f
 v_test_israel_node0002 | public  |   promotion_dimension_b1 |   t   |    1347   |   f   |   f
 v_test_israel_node0001 | public  |   promotion_dimension_b1 |   t   |    1347   |   f   |   f
 v_test_israel_node0003 | public  |   promotion_dimension_b1 |   t   |    1347   |   f   |   f
 v_test_israel_node0001 | public  |   warehouse_dimension_b1 |   t   |    1347   |   f   |   f
 v_test_israel_node0002 | public  |   warehouse_dimension_b1 |   t   |    1347   |   f   |   f
 v_test_israel_node0003 | public  |   warehouse_dimension_b1 |   t   |    1347   |   f   |   f 

Last Good Epoch (LGE)

The minimum checkpoint epoch across all the nodes is known as the last good epoch. The last good epoch refers to the most recent epoch that can be recovered in a manual recovery. The LGE consists of a snapshot of all the data on the disk. If the cluster shuts down abnormally, the data after the LGE is lost. The Tuple Mover advances the CPE and sets a new LGE. If the Tuple Mover fails, the data does not move from the WOS to the ROS. Hence, the data does not advance the CPE and the LGE. Each node has an LGE, the minimum checkpoint epoch of the projections on the node. Vertica computes the cluster LGE from node LGEs such that cluster LGE takes advantage of data K-safety to present the highest possible LGE. To see the cluster last good epoch, use the following command:

=> SELECT GET_LAST_GOOD_EPOCH();
 GET_LAST_GOOD_EPOCH
---------------------
                1347
(1 row)

You can verify each node LGE by checking the recovery epoch:

=> SELECT GET_EXPECTED_RECOVERY_EPOCH();
INFO 4544:  Recovery Epoch Computation:
Node Dependencies:
011 - cnt: 253
101 - cnt: 253
110 - cnt: 253
111 - cnt: 239
Nodes certainly in the cluster:
        Node 0(v_test_israel_node0001), epoch 1347
        Node 1(v_test_israel_node0002), epoch 1347
Filling more nodes to satisfy node dependencies:
Data dependencies fulfilled, remaining nodes LGEs don't matter:
        Node 2(v_test_israel_node0003), epoch 1347
--
 GET_EXPECTED_RECOVERY_EPOCH
-----------------------------
                        1347
(1 row)

Ancient History Mark (AHM)

A large epoch map can increase the catalog size. The ancient history mark is the epoch prior to which historical data can be purged from physical storage. You cannot run any historical queries prior to the AHM. By default, Vertica advances the AHM at an interval of 3 minutes to be equal with LGE. When the node in the cluster is down with unrefreshed projections, the AHM does not advance. The AHM is never greater than the LGE. To verify the AHM, use the following command:

=> SELECT GET_AHM_EPOCH();
GET_AHM_EPOCH
---------------
         1347
(1 row)

How Epochs Work

With the COMMIT of a DML transaction (INSERT, UPDATE, MERGE, COPY, and DELETE), both the CE and the LE advance. When the current epoch moves by 1, the LE also moves by 1. The current epoch becomes the latest epoch. Depending on the type of DML transaction, Vertica does the following:

  • If the data is loaded by the INSERT or the COPY statement, each row has an epoch value representing the time the row was committed.
  • If the data is deleted using a DELETE statement, Vertica creates delete vectors that store the epoch. The delete vectors store the position in the ROS container that is marked for deletion.

The following commands show when Vertica creates new epochs:

=> CREATE TABLE test_epochs ( c1 int);
CREATE TABLE

=> SELECT CURRENT_EPOCH, AHM_EPOCH, LAST_GOOD_EPOCH FROM SYSTEM;
 CURRENT_EPOCH | AHM_EPOCH | LAST_GOOD_EPOCH
---------------+-----------+-----------------
          1348 |      1347 |            1347
(1 row)

-- Insert new data
=> INSERT INTO test_epochs VALUES (1);
 OUTPUT
--------
      1
(1 row)

-- Because the INSERT was not committed, no epoch was recorded
=> SELECT epoch, * FROM test_epochs;
 epoch | c1
-------+----
       |  1
(1 row)

=> COMMIT;
COMMIT

-- The COMMIT transaction uses last current epoch and current epoch becomes the latest epoch.

=> SELECT epoch, * FROM test_epochs;
 epoch | c1
-------+----
  1348 |  1
(1 row)

-- Current epoch advances because of the last COMMIT.
=> SELECT CURRENT_EPOCH, AHM_EPOCH, LAST_GOOD_EPOCH FROM SYSTEM;
 CURRENT_EPOCH | AHM_EPOCH | LAST_GOOD_EPOCH
---------------+-----------+-----------------
          1349 |      1347 |            1347
(1 row)

-- In Vertica, UPDATE is DELETE + INSERT. 

=> UPDATE test_epochs SET c1 = 2 WHERE c1 = 1;
 OUTPUT
--------
      1
(1 row)

- Because transaction has not been committed, the epoch column is empty.
=> SELECT epoch, * FROM test_epochs;
 epoch | c1
-------+----
       |  2
(1 row)

=> COMMIT;
COMMIT

- After the COMMIT, the last epoch becomes the last current epoch.
=> SELECT epoch, * FROM test_epochs;
 epoch | c1
-------+----
  1349 |  2
(1 row)

=> SELECT CURRENT_EPOCH, AHM_EPOCH, LAST_GOOD_EPOCH FROM SYSTEM;
 CURRENT_EPOCH | AHM_EPOCH | LAST_GOOD_EPOCH
---------------+-----------+-----------------
          1350 |      1348 |            1348
(1 row)

On executing a SELECT statement, Vertica retrieves data at the latest epoch. Vertica also retrieves uncommitted data in the same transaction.

Any data deleted earlier than the AHM is eligible for removal. For more information about deleting data, see the Best Practices for Deleting Data document.

Troubleshooting Epoch Issues

The following information provides tips about how to troubleshoot any epoch problems.

Last Good Epoch Does Not Advance

There are certain situations when the last good epoch does not advance. If the LGE advances, you see the following result. When the Tuple Mover moves the data from the WOS to the ROS, the LGE advances:

=>SELECT CURRENT_EPOCH, LAST_GOOD_EPOCH FROM SYSTEM ;
 CURRENT_EPOCH   |   LAST_GOOD_EPOCH
---------------+-----------------
        630384   |       620380
  • If you do not see the LGE advance, check if there is data in the WOS:
=>SELECT sum(wos_used_bytes) from projection_storage ;
  sum
-------
 49152 
  • If there is data in the WOS, force a moveout operation:
=> SELECT do_tm_task('moveout');
  • If there is static data, check the Tuple Mover. The Tuple Mover moveout operation fails due to the following reasons:
    • If the Tuple Mover cannot acquire a T-lock on a table.
    • If the Tuple Mover reaches the maximum number of ROS containers, which results in ROS pushback. ROS pushback occurs when the moveout operation cannot create a new container because the maximum number of ROS containers is reached.
    • If you have more than 1024 partitions in the WOS.

To find out why the Tuple Mover is failing, check the vertica.log file.

$ FIND <catalog_path> -name vertica.log   |   xargs grep 'TM Moveout.* <ERROR>'
  • Replay deletes slow the operation of Tuple Mover. To verify the movement of the Tuple Mover, use the following command:
=> SELECT * from tuple_mover_operations where is_executing ;

Ancient History Mark Does Not Advance

There are times that the ancient history marker does not advance. The AHM does not advance in the following scenarios:

  • If there is an unrefreshed projection. To resolve this, export the DDL and refresh the projection. To find about the unrefreshed projection, use the following command:
=> SELECT * FROM projections where is_up_to_date = 'f';
  • If the projection to be refreshed is a large table. Try the workaround described in the Best Practices for Refreshing Large Projections. This workaround makes sure that Vertica does not hold the AHM until the refresh projection operation finishes.
-- When try to advance the AHM, you get an error

=> SELECT MAKE_AHM_NOW();
ERROR 2154:  AHM must lag behind the create epoch of unrefreshed projection public.test_2_b0 (Create epoch: 1372)

-- Export DDL to save if want to deploy later

=> SELECT export_objects('','test_epochs');
.... 

-- Drop projection 

=> drop projection test_2;
DROP PROJECTION

-- Advance the AHM
=> SELECT MAKE_AHM_NOW();
         MAKE_AHM_NOW
-------------------------------
 AHM set (New AHM Epoch: 1393)
(1 row)
  • If all the nodes in the cluster are not UP. Check the status of nodes using the following command:
=> SELECT * FROM nodes where node_state != 'UP';

Vertica Epoch Functions

Vertica offers the following Epoch functions.

Function Name Description

GET_AHM_EPOCH

Returns the epoch number in which the AHM is located.

GET_CURRENT_EPOCH

Returns the current epoch number during which data is written by the COPY, INSERT, UPDATE, and DELETE operations.

GET_LAST_GOOD_EPOCH

Returns the last good epoch number. The last good epoch refers to the recent epoch that can be recovered using manual recovery.

SET_AHM_EPOCH

Sets the AHM to the specified epoch. The SET_AHM_EPOCH is normally used for testing purposes.

For more information, see Epochs in the Vertica documentation.