Data versioning and auditing are among the most important components of almost every software system. Changes made by users must be tracked so that the history of a dataset can be reviewed and it can be traced who changed which dataset. Data versioning is mostly implemented at the application level and consumes the majority of the development effort. To reduce this effort and the associated risk, we implemented a storage engine that performs automatic data versioning. With this engine, no additional code is needed on the application side to perform the data versioning. Furthermore, a table can be recreated as it was at a specific point in time simply by setting a specific session variable.

In this task you will implement a complete physical table restore. Currently, a specific version of a data row is retrieved by scanning all passing rows. When multiple queries are executed against the same table, the table is therefore reconstructed for every query. This behavior should be changed so that the table is physically reconstructed and kept in this state until the user changes the session variables to select a different version.

This task includes the following subtasks:
- Get acquainted with the revision engine architecture
- Analyze the table structure for
- Design and implement the physical table reconstruction
- Implement a cleanup procedure for the physical table
- Run benchmark tests to measure the benefits
Optional, if time is still available:
- Hide all versioning tables to make the versioning more transparent to the user
- Add the additional code needed to support the performance schema
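
The reconstruction and cleanup subtasks above can be pictured with a small, self-contained C++ sketch. It is not the revision engine's code: `RowVersion`, `reconstruct`, `Session`, and the timestamp-based history layout are all hypothetical names and assumptions chosen for illustration. The sketch rebuilds the table for a requested point in time once, reuses that physical copy for every following query, and discards it only when the requested version changes.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical history record: one entry per version of a row.
struct RowVersion {
    int row_id;
    std::uint64_t valid_from;  // time at which this version became current
    bool deleted;              // true if this version records a deletion
    std::string payload;
};

// Rebuild the table as it was at `as_of` with a single scan of the history.
std::map<int, std::string> reconstruct(const std::vector<RowVersion>& history,
                                       std::uint64_t as_of) {
    std::map<int, const RowVersion*> newest;
    for (const RowVersion& v : history) {
        if (v.valid_from > as_of) continue;          // version is too new
        const RowVersion*& slot = newest[v.row_id];  // nullptr on first sight
        if (slot == nullptr || slot->valid_from <= v.valid_from) slot = &v;
    }
    std::map<int, std::string> table;
    for (const auto& entry : newest) {
        if (!entry.second->deleted) table[entry.first] = entry.second->payload;
    }
    return table;
}

// Hypothetical session: keeps the reconstructed table until the version changes.
class Session {
public:
    explicit Session(const std::vector<RowVersion>& history)
        : history_(history),
          as_of_(std::numeric_limits<std::uint64_t>::max()),
          snapshot_valid_(false),
          rebuilds_(0) {}

    // Stands in for setting the session variable that selects the version.
    void set_version(std::uint64_t as_of) {
        if (as_of == as_of_) return;  // same version: keep the physical copy
        as_of_ = as_of;
        snapshot_.clear();            // cleanup of the old physical copy
        snapshot_valid_ = false;
    }

    // Every query reads from the cached copy; it is rebuilt only when missing.
    const std::map<int, std::string>& table() {
        if (!snapshot_valid_) {
            snapshot_ = reconstruct(history_, as_of_);
            snapshot_valid_ = true;
            ++rebuilds_;
        }
        return snapshot_;
    }

    int rebuilds() const { return rebuilds_; }

private:
    const std::vector<RowVersion>& history_;  // must outlive the session
    std::uint64_t as_of_;
    std::map<int, std::string> snapshot_;
    bool snapshot_valid_;
    int rebuilds_;
};
```

In this sketch, calling `table()` several times after one `set_version()` call triggers a single reconstruction, which is the effect the benchmark subtask would be expected to measure. A real implementation inside a storage engine would also have to decide where the physical copy is stored and when it is dropped, for example at the end of the session.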
For this work you will need knowledge of the C and C++ programming languages. All work must be published under the GPLv3 license.
If you have any questions, please contact:
peter@ddengine.org