Pros and Cons of CVS

The compressed value storage (CVS) is new. I say that because I can't find materials about it or something similar to it online. Or perhaps I am searching with the wrong keywords. Basically, my problem is that there is not much to find about how CVS can be used in matrix operations. So, for now, I have to figure things out on my own.


(
    List<int> value, 
    List<List<int>> linearIndex, 
    MatrixLinearIndexMode linearIndexMode,
    (int rows, int columns) size
)

CVS is a lossless compression of matrix values (vs. CRS compression of row indexes or CCS compression of column indexes). CVS is in many ways coordinate-wise. Logic used to perform operations with DOK sparse matrices (see Mendz.Library.Matrices.IDOKSparseMatrixExtensions), can more or less be re-used when performing matrix operations with CVS.

Pros

CVS makes it easy to insert entries to the matrix. First you need to check if the entry's value is new or not. If it's new, add new entries for it in both Value and LinearIndex. If it already exists, you can simply add the new linear index to the value's LinearIndex collection and re-sort.

Removing items from CVS is easy as well. If the value exists, then remove the linear index from its LinearIndex collection. If the value's LinearIndex count is 0, then the value and its linear index list must be removed from Value and LinearIndex respectively.

A characteristic of CVS is that it allows some operations to store the result in place, instead of creating a new CVS instance. For example, Mendz.Library.Matrices.CVS has a Transpose() method that transposes the data in the same CVS instance, altering its Size and its collection of linear indexes without adding much memory use during and after the operation. It also has the SetLinearIndexMode() method which makes it possible to switch the CVS instance's linear index mode (from row-major order to column-major order and vise versa) without adding much memory use during and after the operation.

Transpose() and SetLinearIndexMode() in place operations affect the linear indexes in CVS. One operation that affects the values that can be performed to CVS in place is matrix-scalar multiplication. Because the values in CVS are unique, multiplying them to a scalar value should also return unique values. Thus, it should be possible to affect the list of values without affecting their respective list of linear indexes, and the CVS would remain valid. Note that this is possible if the CVS value and scalar types are the same, thus the result should have the same type (see Mendz.Library.Matrices.CVSExtensions.MatrixScalarProductInPlace() method).

Cons

Although inserting items to CVS seems easy, ensuring its data integrity requires an extra step. The expectation is that the collections of linear indexes in LinearIndex are unique. After inserting a new value, it may be important to make sure that the linear index does not exist anywhere else. If it does, then the other entry needs to be removed. For very large matrices, this can be a slow process.

Problems with Addition/Subtraction

The requirement that the CVS values are non-zero and unique makes addition and subtraction a bit trickier.

As you should find in a later article about this operation, the current approach I have is to loop through the input CVS's to populate a LinearIndexKeyedConcurrentSparseMatrix instance (which can theoretically use 30-50% smaller memory than CoordinatesKeyedConcurrentSparseMatrix). The result sparse matrix is then compressed to CVS and returned as the result. Although concurrent, with a couple of loops, the overall operation requires a huge jump in memory requirement during addition/subtraction.

I am hopeful that a more memory efficient process should be possible. It just needs to get formulated...

Problems with Multiplication

The problems with addition/subtraction are pretty much the same in multiplication, but compounded by the fact that the resulting matrix can be of a different size as well. Managing the resulting CVS values so that they are non-zero and unique, and then aligning them to the correct list of linear indexes in the new matrix is quite a mental exercise.

As you should find in a later article about this operation, the current approach I have is to loop through the input CVS's to populate a LinearIndexKeyedConcurrentSparseMatrix instance (which can theoretically use 30-50% smaller memory than CoordinatesKeyedConcurrentSparseMatrix). The result sparse matrix is then compressed to CVS and returned as the result. Although concurrent, with loops within loops within loops, the overall operation requires a huge jump in memory requirement during multiplication.

I am hopeful that a more memory efficient process should be possible. It just needs to get formulated...

Search This Blog

As I See Tech