JPEG Pleno Point Cloud Coding - Deep Learning-based Coding of Point Clouds


The context

JPEG has been a highly relevant standardisation committee for many years, mostly because of the well-known JPEG standard and the successful JPEG 2000. More recently, the committee has been broadening its portfolio of standards in response to the new demands created by recent technological developments.

The JPEG standard for digital photography is arguably the most widely used standard in history. It was defined under the umbrella of ISO, IEC and ITU, three of the most influential standardisation organisations, which contributed to its success. It has earned an important place in the world's heritage.

JPEG's success rests on a well-defined standardisation process that consists of:

  1. the definition of use cases and requirements,
  2. the creation of tools for common test conditions,
  3. the call for proposals,
  4. the analysis of the responses to the call for proposals based on the common test conditions,
  5. the definition of a verification model based on the core responses, and
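  6. the definition of the standard itself, which progresses through the Working Draft (WD), Committee Draft (CD), Draft International Standard (DIS), Final Draft International Standard (FDIS) and International Standard (IS) stages, using the mechanisms of ISO.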

The challenges

Recently, JPEG has been particularly active in defining deep learning-based codecs. JPEG AI reached the CD stage at the 100th JPEG meeting, held in Covilhã, Portugal. The standardisation activity described here aims to support the development of JPEG Pleno Learning-based Point Cloud Coding.

How standardisation activities help face the challenges

The scope of the JPEG Pleno Point Cloud activity is the creation of a learning-based coding standard for point clouds and their associated attributes, offering a single-stream, compact compressed domain representation that supports advanced, flexible data access functionalities. This standard targets both interactive human visualisation, with competitive compression efficiency compared to the state-of-the-art point cloud coding solutions in common use, and effective performance for 3D processing and machine-related computer vision tasks, with the goal of supporting a royalty-free baseline.

The benefits

This standard is envisioned to provide a number of unique benefits, including a single efficient point cloud representation for both humans and machines. The intent is to give humans the ability to visualise and interact with the point cloud geometry and attributes, while giving machines the ability to perform 3D processing and computer vision tasks in two domains. In the decompressed/reconstructed domain, this is supported notably by enforcing error constraints. In the compressed domain (the latents after entropy decoding), it is supported notably by enabling lower complexity and higher accuracy, since the compressed domain features are extracted from the original point cloud rather than from the lossy decoded one.

Future plans

To support the scope above, this activity will advance through a series of stages that shall develop as follows:

Stage 1: A learning-based coding standard addressing human visualisation, and decompressed/reconstructed domain 3D processing and computer vision tasks;

Stage 2: A learning-based coding standard additionally supporting compressed domain 3D processing such as visual enhancement and super-resolution; and

Stage 3: A learning-based coding standard additionally supporting compressed domain computer vision tasks such as classification, recognition and segmentation.

JPEG Pleno Learning-based Point Cloud Coding Stage 1 is currently at the WD stage, and the CD is expected at the next meeting (October/November 2023). The initial verification model was devoted to a joint model for the simultaneous coding of the geometry and texture information of point clouds. The committee subsequently opted for a new model in which the geometry is compressed using a deep learning-based architecture, and a projection of the texture onto the decoded geometry is compressed using the relatively mature JPEG AI, with effective gains in compression.
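
To make the idea of projecting texture onto decoded geometry more concrete, the sketch below shows one simple way it could be done: each decoded point takes the colour of its nearest point in the original cloud. This is only an illustration under stated assumptions. The function names are hypothetical, the nearest-neighbour rule is an assumption rather than the projection defined in the verification model, and the later step of coding the texture with JPEG AI is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Colour = Tuple[int, int, int]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def recolour_nearest(
    original_xyz: Sequence[Point],
    original_rgb: Sequence[Colour],
    decoded_xyz: Sequence[Point],
) -> List[Colour]:
    """Give each decoded point the colour of its nearest original point.

    Hypothetical helper: a brute-force nearest-neighbour search, used here
    only as a stand-in for the projection step (an assumption).
    """
    recoloured = []
    for p in decoded_xyz:
        nearest = min(
            range(len(original_xyz)),
            key=lambda i: squared_distance(p, original_xyz[i]),
        )
        recoloured.append(original_rgb[nearest])
    return recoloured


# Toy data: three original points, four slightly displaced decoded points.
original_xyz = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
original_rgb = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
decoded_xyz = [(1.0, 0.0, 0.0), (9.0, 1.0, 0.0), (0.0, 8.0, 1.0), (0.0, 1.0, 0.0)]

decoded_rgb = recolour_nearest(original_xyz, original_rgb, decoded_xyz)
print(decoded_rgb)
```

The point of the sketch is that the texture is attached to the geometry the decoder will actually reconstruct, not to the original geometry, so the colour information stays consistent with the lossy point positions.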

This model is named Verification Model 4 (VM4), and the JPEG Committee experts believe it will be closely related to the final model that will constitute the future CD. The development of VM4 for JPEG Pleno Learning-based Point Cloud Coding has relied on the studies developed by JPEG on the quality assessment of static point clouds under coding distortion. The programme defined for this activity, which is still under development, has made several contributions, including:

  • Assessment of the responses to the call for proposals,
  • Testing and evaluation of the different proposals for the development of the verification model,
  • Codec performance stability studies,
  • Verification of the verification models,
  • Redefinition of the common test conditions, introducing a new testing dataset,
  • Redefinition of the common test conditions, adding new objective metrics based on different subjective studies developed either internally by JPEG or by members.
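
As an illustration of what an objective metric for point cloud geometry can look like, the sketch below computes a symmetric point-to-point PSNR between a reference cloud and a degraded one. It is a minimal example under stated assumptions: the text above does not name the metrics adopted in the common test conditions, so this one is chosen only because it is widely used; the function names are hypothetical, and the peak value and any scaling factor are conventions that differ between tools.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared distance from p to its nearest neighbour in cloud (brute force)."""
    return min(sum((pi - qi) ** 2 for pi, qi in zip(p, q)) for q in cloud)


def one_way_mse(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Mean squared nearest-neighbour distance from source to target."""
    return sum(nearest_sq_dist(p, target) for p in source) / len(source)


def point_to_point_psnr(
    reference: Sequence[Point], degraded: Sequence[Point], peak: float
) -> float:
    """Symmetric point-to-point PSNR in dB.

    Assumption: the symmetric error is the larger of the two one-way errors,
    and PSNR = 10 * log10(peak^2 / mse). Other conventions exist.
    """
    mse = max(one_way_mse(reference, degraded), one_way_mse(degraded, reference))
    if mse == 0.0:
        return math.inf  # identical geometry
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy data: one of three points is displaced by one unit along z.
reference = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
degraded = [(0.0, 0.0, 1.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]

# peak=1023 assumes 10-bit voxelised coordinates (an illustrative choice).
psnr = point_to_point_psnr(reference, degraded, peak=1023.0)
print(f"point-to-point PSNR: {psnr:.2f} dB")
```

A metric of this kind is only useful in the common test conditions to the extent that it agrees with human judgement, which is why the objective metrics are selected on the basis of subjective studies.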

This activity was effective and crucial in providing the groundwork for the first learning-based codec for the representation of 3D visual information. It contributed to the successful definition of the future CD of Stage 1, devoted to the visualisation of point cloud information, and laid the grounds for Stages 2 and 3, which address point cloud processing and classification using the latent representation and will be considered in the near future.

Professor Antonio Pinheiro - Fellow