- # Terminology
- ## QR Code
- A [*QR code*](https://en.wikipedia.org/wiki/QR_code) (quick-response code) is a type of two-dimensional matrix barcode, invented
- in 1994 by Japanese company [Denso Wave](https://www.qrcode.com/en/faq.html#patentH2Title) for labelling automobile parts.
- The QR labelling system was applied beyond the automobile industry due to its fast readability and greater storage capacity
- compared to standard UPC barcodes.
- QR Codes, and more specifically the popular *Model 2*, are internationally standardized in ISO/IEC 18004.
- ## Matrix
- A QR symbol is arranged in a *matrix* consisting of an array of nominally square modules arranged in an overall square pattern.
- For ease of reference, module positions are defined by their row and column coordinates in the symbol, in the form `(x, y)`
- where `x` designates the column (counting from left to right) and `y` the row (counting from the top downwards) in which
- the module is located, with counting commencing at 0. Module `(0, 0)` is therefore located in the upper left corner of the symbol.
- ### Module
- A *module* represents a single square "pixel" in the matrix (not to be confused with pixels in a raster image or on a screen).
- A dark module represents a binary one and a light module represents a binary zero.
- ### Version
- The *version* of a QR symbol determines the side length of the matrix (and therefore the maximum capacity of code words),
- ranging from 21×21 modules (441 total) at version 1 to 177×177 modules (31329 total) at version 40.
- The side length increases in steps of 4 modules per version and can be calculated as `4 * version + 17`.
- The maximum capacity for each version, mode and ECC level can be found in [this table (qrcode.com)](https://www.qrcode.com/en/about/version.html).
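The side length formula above can be checked with a few lines of Python. This is a minimal sketch; the function name `side_length` is hypothetical and not part of any library.

```python
def side_length(version: int) -> int:
    """Return the number of modules per side for a QR symbol version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return 4 * version + 17


# Print side length and total module count for a few versions.
for version in (1, 2, 40):
    size = side_length(version)
    print(f"version {version}: {size}x{size} = {size * size} modules")
```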
- ## Function Patterns
- ### Finder Pattern
- The *Finder Pattern* shall consist of three identical Position Detection Patterns located at the upper left, upper right
- and lower left corners of the symbol.
- Each Position Detection Pattern may be viewed as three superimposed concentric squares and is constructed of dark 7×7 modules,
- light 5×5 modules and dark 3×3 modules.
- The symbol is preferentially encoded so that similar patterns have a low probability of being encountered elsewhere in the symbol,
- enabling rapid identification of a possible QR Code symbol in the field of view. Identification of the three Position Detection
- Patterns comprising the finder pattern then unambiguously defines the location and orientation of the symbol in the field of view.
- <p align="center">
- <img alt="Finder pattern" src="">
- </p>
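The three concentric squares can be generated from the distance of each module to the centre of the pattern. The following is an illustrative sketch only; the function name `finder_pattern` and the 0/1 representation (1 = dark) are assumptions made for this example.

```python
def finder_pattern() -> list:
    """Return a 7x7 Position Detection Pattern as rows of 0/1 (1 = dark)."""
    rows = []
    for y in range(7):
        row = []
        for x in range(7):
            # Chebyshev distance from the centre module (3, 3):
            # 3 = outer dark ring, 2 = light ring, 0..1 = dark 3x3 core.
            ring = max(abs(x - 3), abs(y - 3))
            row.append(0 if ring == 2 else 1)
        rows.append(row)
    return rows


for row in finder_pattern():
    print("".join("#" if module else "." for module in row))
```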
- ### Alignment Pattern
- The *Alignment Pattern* is a fixed reference pattern in defined positions, which enables the decode software to
- resynchronise the coordinate mapping of the modules in the event of moderate amounts of distortion of the image.
- Each Alignment Pattern may be viewed as three superimposed concentric squares and is constructed of dark 5×5
- modules, light 3×3 modules and a single central dark module.
- The number of Alignment Patterns depends on the symbol version, and they shall be placed in all Model 2 symbols of
- version 2 or larger in positions defined in the specification.
- <p align="center">
- <img alt="Alignment Pattern" src="">
- </p>
- ### Timing Pattern
- The horizontal and vertical Timing Patterns respectively consist of a one module wide row or column of alternating
- dark and light modules, commencing and ending with a dark module. The horizontal Timing Pattern runs across
- row 6 of the symbol between the separators for the upper Position Detection Patterns; the vertical Timing Pattern
- similarly runs down column 6 of the symbol between the separators for the left-hand Position Detection Patterns.
- They enable the symbol density and version to be determined and provide datum positions for determining module
- coordinates.
- <p align="center">
- <img alt="Timing Pattern" src="">
- </p>
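Since the Timing Patterns alternate along row 6 and column 6, the colour of each module depends only on whether its running index is even. A minimal sketch, assuming the patterns occupy indices 8 to `size - 9` (the span between the separators); the function name `timing_modules` is hypothetical.

```python
def timing_modules(version: int) -> list:
    """Return (index, dark) pairs for one Timing Pattern of the given version.

    The same indices apply to the horizontal pattern, module (index, 6),
    and to the vertical pattern, module (6, index).
    """
    size = 4 * version + 17
    # Even indices are dark, so the run starts and ends with a dark module.
    return [(i, 1 if i % 2 == 0 else 0) for i in range(8, size - 8)]


print(timing_modules(1))
```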
- ### Separators
- A pattern of all light modules, one module wide, separating the Position Detection Patterns from the rest of the symbol.
- <p align="center">
- <img alt="Separators" src="">
- </p>
- ### Quiet Zone
- This is a region 4 modules wide which shall be free of all other markings, surrounding the symbol on all four sides.
- Its nominal reflectance value shall be equal to that of the light modules.
- <p align="center">
- <img alt="Quiet Zone" src="">
- </p>
- ## Encoding Region
- This region shall contain the symbol characters representing data, those representing error correction codewords,
- the Version Information and Format Information.
- ### Data
- This region contains the encoded data and error correction code blocks. Data bits are placed starting at the bottom-right of
- the matrix and proceeding upward in a column that is 2 modules wide. When the column reaches the top, the next 2-module column
- starts immediately to the left of the previous column and continues downward. Whenever the current column reaches the edge of
- the matrix, move on to the next 2-module column and change direction. If a function pattern or reserved area is encountered,
- the data bit is placed in the next unused module.
- (see [wikipedia QR code - Encoding](https://en.wikipedia.org/wiki/QR_code#Encoding) and [thonky.com - QR Code Tutorial](https://www.thonky.com/qr-code-tutorial/module-placement-matrix#step-6-place-the-data-bits))
- <p align="center">
- <img alt="Data" src="">
- </p>
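The zigzag traversal described above can be sketched as a function that lists module coordinates in placement order. This is a simplified illustration, not a reference implementation: `placement_order` and the `is_reserved` callback are hypothetical names, and the skip of column 6 (the vertical Timing Pattern) follows the behaviour defined in ISO/IEC 18004 rather than anything stated in this section.

```python
from typing import Callable, List, Tuple


def placement_order(size: int, is_reserved: Callable[[int, int], bool]) -> List[Tuple[int, int]]:
    """List the (x, y) coordinates of data modules in placement order."""
    order = []
    right = size - 1  # x of the right-hand module in the current 2-module column
    upward = True     # the first column pair is traversed from bottom to top
    while right >= 1:
        if right == 6:
            # Column 6 holds the vertical Timing Pattern, shift one to the left.
            right = 5
        rows = range(size - 1, -1, -1) if upward else range(size)
        for y in rows:
            for x in (right, right - 1):
                # Function patterns and reserved areas are skipped.
                if not is_reserved(x, y):
                    order.append((x, y))
        upward = not upward
        right -= 2
    return order


# Example: a version 1 matrix (21x21) with nothing reserved.
print(placement_order(21, lambda x, y: False)[:6])
```

In a real encoder, `is_reserved` would return `True` for every module covered by a function pattern, the Format Information or the Version Information.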
- ### Version Information
- The Version Information is an 18 bit sequence containing 6 data bits, which hold the version number, followed by 12 error correction bits
- calculated using the (18, 6) [BCH code](https://en.wikipedia.org/wiki/BCH_code). It is only present in symbols of version 7 or larger.
- <p align="center">
- <img alt="Version Information" src="">
- </p>
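As an illustration of the (18, 6) BCH calculation, the 12 error correction bits are the remainder of a polynomial division. The sketch below assumes the generator polynomial `0x1F25` from ISO/IEC 18004 and the restriction to versions 7 to 40; the function name `version_information` is hypothetical.

```python
def version_information(version: int) -> int:
    """Return the 18 bit Version Information word for versions 7-40."""
    if not 7 <= version <= 40:
        raise ValueError("Version Information exists only for versions 7 to 40")
    remainder = version
    for _ in range(12):
        # Shift in one zero bit and subtract (XOR) the generator polynomial
        # whenever the top bit of the 12 bit remainder was set.
        remainder = (remainder << 1) ^ ((remainder >> 11) * 0x1F25)
    # 6 data bits followed by 12 error correction bits.
    return (version << 12) | remainder


print(format(version_information(7), "018b"))
```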
- ### Format Information
- The Format Information is a 15 bit sequence containing 5 data bits, with 10 error correction bits calculated using the (15, 5) BCH code.
- It contains information on the error correction level applied to the symbol and on the masking pattern used,
- essential to enable the remainder of the encoding region to be decoded.
- <p align="center">
- <img alt="Format Information" src="">
- </p>
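The Format Information can be computed in the same manner. This sketch assumes the generator polynomial `0x537` and the fixed XOR mask `101010000010010` (`0x5412`) that ISO/IEC 18004 applies to the result; neither value is stated in this section. The names `ECC_INDICATOR` and `format_information` are hypothetical.

```python
# Two-bit ECC level indicators as defined by the standard.
ECC_INDICATOR = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}


def format_information(ecc_level: str, mask_pattern: int) -> int:
    """Return the masked 15 bit Format Information word."""
    if not 0 <= mask_pattern <= 7:
        raise ValueError("mask pattern must be between 0 and 7")
    # 5 data bits: 2 bit ECC indicator followed by the 3 bit mask pattern.
    data = (ECC_INDICATOR[ecc_level] << 3) | mask_pattern
    remainder = data
    for _ in range(10):
        remainder = (remainder << 1) ^ ((remainder >> 9) * 0x537)
    # Append the 10 error correction bits, then apply the fixed XOR mask.
    return ((data << 10) | remainder) ^ 0x5412


print(format(format_information("M", 0b101), "015b"))
```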
- ### Darkmodule
- The module in position `(8, 4 * version + 9)` (column 8, row `4 * version + 9`) shall always be dark and does not form part of the Format Information.
- <p align="center">
- <img alt="Darkmodule" src="">
- </p>
- ## Mode
- The *mode* is the method of representing a defined character set as a bit string. Each mode is identified by a *mode indicator*, a four-bit identifier that states in which mode the following data sequence is encoded.
- | Mode | Indicator | Description |
- |-------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
- | Numeric | `0001` | Numeric encoding, 10 bits per 3 digits |
- | Alphanumeric | `0010` | Alphanumeric encoding, 11 bits per 2 characters |
- | Byte | `0100` | Byte encoding, 8 bits per character |
- | Kanji | `1000` | [Kanji](https://en.wikipedia.org/wiki/Kanji) encoding (Japanese, [Shift-JIS](https://en.wikipedia.org/wiki/Shift_JIS)), 13 bits per character |
- | Hanzi<sup>*</sup> | `1101` | [Hanzi](https://en.wikipedia.org/wiki/Chinese_characters) encoding (simplified Chinese, [GB2312/GB18030](https://en.wikipedia.org/wiki/GB_18030)), 13 bits per character |
- | Structured append | `0011` | used to split a message across multiple (up to 16) QR symbols |
- | ECI | `0111` | [Extended Channel Interpretation](https://en.wikipedia.org/wiki/Extended_Channel_Interpretation) (select alternate character set or encoding) |
- | FNC1 in first position | `0101` | see [Code 128](https://en.wikipedia.org/wiki/Code_128), also [zxing/issues/1373](https://github.com/zxing/zxing/issues/1373) |
- | FNC1 in second position | `1001` | |
- | Terminator | `0000` | End of message |
- <sup>*</sup> Hanzi mode is not part of the ISO specification, but of the Chinese standard [GB/T 18284](https://www.chinesestandard.net/PDF/English.aspx/GBT18284-2000)
- ### Segment
- Each segment consists of the 4 bit mode indicator followed by the data bit stream, where the content of the bit stream can vary depending on the mode:
- | Mode | Bit stream contents |
- |-------------------------|---------------------------------------------------------------------------------------------------------------------------|
- | Numeric | \[ `0001` : 4 ] \[ Character Count Indicator : variable ] \[ Data Bit Stream : 3 1⁄3 × charcount ] |
- | Alphanumeric | \[ `0010` : 4 ] \[ Character Count Indicator : variable ] \[ Data Bit Stream : 5 1⁄2 × charcount ] |
- | Byte | \[ `0100` : 4 ] \[ Character Count Indicator : variable ] \[ Data Bit Stream : 8 × charcount ] |
- | Kanji | \[ `1000` : 4 ] \[ Character Count Indicator : variable ] \[ Data Bit Stream : 13 × charcount ] |
- | Hanzi | \[ `1101` : 4 ] \[ Subset Indicator : 4 ] \[ Character Count Indicator : variable ] \[ Data Bit Stream : 13 × charcount ] |
- | Structured append | \[ `0011` : 4 ] \[ Symbol Position : 4 ] \[ Total Symbols : 4 ] \[ Parity : 8 ] |
- | ECI | \[ `0111` : 4 ] \[ ECI Assignment number : variable ] |
- | FNC1 in first position | \[ `0101` : 4 ] \[ Numeric/Alphanumeric/Byte/Kanji/Hanzi payload : variable ] |
- | FNC1 in second position | \[ `1001` : 4 ] \[ Application Indicator : 8 ] \[ Numeric/Alphanumeric/Byte/Kanji/Hanzi payload : variable ] |
- | Terminator | \[ `0000` : 4 ] |
- The length of the Character Count Indicator for Numeric/Alphanumeric/Byte/Kanji/Hanzi varies, depending on the version:
- | Mode | Version 1-9 | Version 10-26 | Version 27-40 |
- |--------------|-------------|---------------|---------------|
- | Numeric | 10 | 12 | 14 |
- | Alphanumeric | 9 | 11 | 13 |
- | Byte | 8 | 16 | 16 |
- | Kanji/Hanzi | 8 | 10 | 12 |
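To show how the mode indicator, the Character Count Indicator and the data bit stream fit together, here is a minimal sketch that builds a Numeric segment. It assumes the grouping rule from ISO/IEC 18004 (3 digits in 10 bits, a trailing group of 2 digits in 7 bits, a single trailing digit in 4 bits); the names `CHAR_COUNT_BITS`, `char_count_bits` and `encode_numeric_segment` are hypothetical.

```python
# Character Count Indicator lengths for versions 1-9, 10-26 and 27-40.
CHAR_COUNT_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def char_count_bits(mode: str, version: int) -> int:
    """Look up the Character Count Indicator length for a mode and version."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    index = 0 if version <= 9 else 1 if version <= 26 else 2
    return CHAR_COUNT_BITS[mode][index]


def encode_numeric_segment(digits: str, version: int) -> str:
    """Return a Numeric segment as a string of '0' and '1' characters."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode accepts the digits 0-9 only")
    count_width = char_count_bits("numeric", version)
    bits = "0001" + format(len(digits), f"0{count_width}b")
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        width = {3: 10, 2: 7, 1: 4}[len(group)]
        bits += format(int(group), f"0{width}b")
    return bits


print(encode_numeric_segment("01234567", version=1))
```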
- ### Extended Channel Interpretation (ECI)
- [Extended Channel Interpretation](https://en.wikipedia.org/wiki/Extended_Channel_Interpretation) can be used to indicate an
- alternate character encoding for the following Byte segment (by default, ISO-8859-1 "Latin-1").
- An ECI segment starts with the 4 bit indicator `0111` followed by the ECI Assignment number (8, 16 or 24 bits),
- followed by a Byte segment (`0100` ...) where the contents are encoded according to the preceding ECI ID.
- The length of the ECI Assignment number depends on the given encoding ID:
- | ID | length (bits) |
- |----------------|---------------|
- | 0 - 127 | 8 |
- | 128 - 16383 | 16 |
- | 16384 - 999999 | 24 |
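The variable-length ECI Assignment number can be sketched as follows. The leading bit patterns (`0`, `10`, `110`) that distinguish the three lengths are taken from ISO/IEC 18004 and are not listed in the table above; the function name `eci_designator` is hypothetical.

```python
def eci_designator(assignment: int) -> str:
    """Return the ECI Assignment number as a string of 8, 16 or 24 bits."""
    if 0 <= assignment <= 127:
        return "0" + format(assignment, "07b")
    if 128 <= assignment <= 16383:
        return "10" + format(assignment, "014b")
    if 16384 <= assignment <= 999999:
        return "110" + format(assignment, "021b")
    raise ValueError("ECI assignment number must be between 0 and 999999")


# ECI 26 designates UTF-8; the header is the mode indicator plus the designator.
print("0111" + eci_designator(26))
```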
- ### Mixed Mode
- Encoding modes can be mixed as needed within a QR symbol in order to optimize data usage.
- Each segment of data is encoded in the appropriate mode, with the basic structure
- *Mode Indicator / Character Count Indicator / Data* and followed immediately by the Mode Indicator commencing the next segment.
- \[ Mode Indicator 1 ]\[ Mode bitstream 1 ]<br>
- ...<br>
- \[ Mode Indicator n ]\[ Mode bitstream n ]<br>
- ...<br>
- \[ `0000` End of message (Terminator) ]
- ## ECC (Error Correction Coding)
- QR codes use [Reed–Solomon error correction](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction), which allows QR code readers to detect and correct errors.
- A detailed breakdown of the process can be found at [thonky.com - QR Code Tutorial](https://www.thonky.com/qr-code-tutorial/error-correction-coding).
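For orientation, the core of the Reed–Solomon step is a polynomial division over GF(2^8). The sketch below is a simplified illustration, not a tuned implementation: it assumes the field polynomial `0x11D` and generator roots α^0 … α^(n-1) with α = 2, as used by QR codes, and all function names are hypothetical. The sample data codewords are arbitrary illustration values.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def rs_generator(degree: int) -> list:
    """Coefficients (highest power first) of (x - a^0)(x - a^1)...(x - a^(degree-1))."""
    poly = [1]
    root = 1
    for _ in range(degree):
        shifted = poly + [0]  # poly * x
        for i, coefficient in enumerate(poly):
            shifted[i + 1] ^= gf_mul(coefficient, root)  # + poly * root
        poly = shifted
        root = gf_mul(root, 2)
    return poly


def rs_error_correction(data: list, degree: int) -> list:
    """Return `degree` error correction codewords for the data codewords."""
    generator = rs_generator(degree)
    remainder = list(data) + [0] * degree
    for i in range(len(data)):
        factor = remainder[i]
        if factor:
            for j, coefficient in enumerate(generator):
                remainder[i + j] ^= gf_mul(coefficient, factor)
    return remainder[len(data):]


def syndromes(codewords: list, degree: int) -> list:
    """Evaluate the codeword polynomial at each generator root (all zero = no error detected)."""
    result = []
    root = 1
    for _ in range(degree):
        value = 0
        for codeword in codewords:
            value = gf_mul(value, root) ^ codeword
        result.append(value)
        root = gf_mul(root, 2)
    return result


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
ecc = rs_error_correction(data, 10)
print(ecc, syndromes(data + ecc, 10))
```

A reader detects damage when any syndrome is non-zero; actually locating and correcting the errors requires additional steps that are outside the scope of this sketch.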
- ### ECC Level
- The number of data versus error correction bytes within each block depends on the version of the QR symbol and the error
- correction level. The higher the error correction level, the lower the remaining storage capacity. The following table lists the approximate
- error correction capability (the share of codewords that can be restored) at each of the four levels:
- | Level    | Short | Recovery | Indicator |
- |----------|-------|----------|-----------|
- | Low | L | 7% | `01` |
- | Medium | M | 15% | `00` |
- | Quartile | Q | 25% | `11` |
- | High | H | 30% | `10` |
- ### Maximum data capacity
- The maximum data capacity of a QR Code at version 40 for each ECC level and mode is shown in the following table:
- | ECC | max. bits | Numeric | Alphanumeric | Byte   | Kanji/Hanzi <sup>*</sup> |
- |-----|-----------|---------|--------------|--------|--------------------------|
- | L | 23648 | 7089 | 4296 | 2953 | 1817 |
- | M | 18672 | 5596 | 3391 | 2331 | 1435 |
- | Q | 13328 | 3993 | 2420 | 1663 | 1024 |
- | H | 10208 | 3057 | 1852 | 1273 | 784 |
- <sup>*</sup> Hanzi mode uses an additional subset indicator of 4 bits length and can therefore store up to one character less than Kanji mode.
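The character counts in the table follow from the maximum bit count once the 4 bit mode indicator and the Character Count Indicator are subtracted. The sketch below reproduces that arithmetic for a single segment at version 40. It assumes the remainder rules of ISO/IEC 18004 (a trailing numeric group of 2 digits takes 7 bits and of 1 digit 4 bits; a single trailing alphanumeric character takes 6 bits); the names used are hypothetical.

```python
# Character Count Indicator lengths for versions 27-40.
COUNT_BITS_V27_40 = {"numeric": 14, "alphanumeric": 13, "byte": 16, "kanji": 12}


def max_characters(total_bits: int, mode: str) -> int:
    """Estimate how many characters fit into one segment of the given mode."""
    payload = total_bits - 4 - COUNT_BITS_V27_40[mode]
    if mode == "numeric":
        groups, rest = divmod(payload, 10)
        extra = 2 if rest >= 7 else 1 if rest >= 4 else 0
        return groups * 3 + extra
    if mode == "alphanumeric":
        pairs, rest = divmod(payload, 11)
        return pairs * 2 + (1 if rest >= 6 else 0)
    if mode == "byte":
        return payload // 8
    return payload // 13  # kanji


for level, bits in (("L", 23648), ("M", 18672), ("Q", 13328), ("H", 10208)):
    print(level, [max_characters(bits, mode) for mode in COUNT_BITS_V27_40])
```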
- ## Data masking
- Masking is the process of XORing the bit pattern in the encoding region with a masking pattern to provide a symbol with more
- evenly balanced numbers of dark and light modules and reduced occurrence of patterns which would interfere with fast processing of the image.
- ### Evaluation
- Each of the 8 mask patterns is evaluated, and the pattern with the lowest penalty score shall be used for the final output.
- During the evaluation, 4 rules are applied to get the penalty score:
- - find runs of 5 or more adjacent modules of the same color in a row or column, for example `00000` or `11111`
- - find 2×2 blocks with the same color
- - find consecutive runs of 1:1:3:1:1:4 starting with a dark module, or 4:1:1:3:1:1 starting with a light module
- - calculate the ratio of dark modules and give an increasing penalty the further the ratio is from 50%
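The first two rules can be sketched as follows. This is an illustration under stated assumptions: the weights (3 points for a run of five plus 1 for each additional module, 3 points per 2×2 block) are the values given in ISO/IEC 18004 and are not listed in this section, and the function names are hypothetical. The matrix is a list of rows containing 0 (light) and 1 (dark).

```python
from itertools import groupby


def penalty_runs(matrix: list) -> int:
    """Rule 1: penalise runs of 5 or more same-colored modules in rows and columns."""
    lines = [list(row) for row in matrix] + [list(column) for column in zip(*matrix)]
    total = 0
    for line in lines:
        for _, group in groupby(line):
            run = sum(1 for _ in group)
            if run >= 5:
                total += 3 + (run - 5)
    return total


def penalty_blocks(matrix: list) -> int:
    """Rule 2: penalise every 2x2 block of same-colored modules (overlaps count)."""
    total = 0
    for y in range(len(matrix) - 1):
        for x in range(len(matrix[y]) - 1):
            if matrix[y][x] == matrix[y][x + 1] == matrix[y + 1][x] == matrix[y + 1][x + 1]:
                total += 3
    return total


all_dark = [[1] * 5 for _ in range(5)]
print(penalty_runs(all_dark), penalty_blocks(all_dark))
```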
- ### Mask pattern
- | Pattern | Mask<sup>*</sup> | Example |
- |---------|---------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
- | `000` | `(x + y) mod 2 = 0` | <img alt="Mask pattern 000" style="margin: 0.25em;" src=""> |
- | `001` | `y mod 2 = 0` | <img alt="Mask pattern 001" style="margin: 0.25em;" src=""> |
- | `010` | `x mod 3 = 0` | <img alt="Mask pattern 010" style="margin: 0.25em;" src=""> |
- | `011` | `(x + y) mod 3 = 0` | <img alt="Mask pattern 011" style="margin: 0.25em;" src=""> |
- | `100` | `((y intdiv 2) + (x intdiv 3)) mod 2 = 0` | <img alt="Mask pattern 100" style="margin: 0.25em;" src=""> |
- | `101`   | `(x * y) mod 2 + (x * y) mod 3 = 0`<br>or:<br>`(x * y) mod 6 = 0` | <img alt="Mask pattern 101" style="margin: 0.25em;" src=""> |
- | `110`   | `((x * y) mod 2 + (x * y) mod 3) mod 2 = 0`<br>or:<br>`(x * y) mod 6 < 3` | <img alt="Mask pattern 110" style="margin: 0.25em;" src=""> |
- | `111`   | `((x * y) mod 3 + (x + y) mod 2) mod 2 = 0`<br>or:<br>`(x + y + (x * y) mod 3) mod 2 = 0` | <img alt="Mask pattern 111" style="margin: 0.25em;" src=""> |
- <sup>*</sup> where `x` = column (width) and `y` = row (height), with `x,y = 0,0` for the top left module<br>
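The mask conditions translate directly into code. The sketch below writes each condition as a predicate on `(x, y)` and XORs the matching modules, leaving reserved modules untouched. The names `MASK_CONDITIONS`, `apply_mask` and the `is_reserved` callback are hypothetical and only serve this example.

```python
MASK_CONDITIONS = {
    0b000: lambda x, y: (x + y) % 2 == 0,
    0b001: lambda x, y: y % 2 == 0,
    0b010: lambda x, y: x % 3 == 0,
    0b011: lambda x, y: (x + y) % 3 == 0,
    0b100: lambda x, y: (y // 2 + x // 3) % 2 == 0,
    0b101: lambda x, y: (x * y) % 2 + (x * y) % 3 == 0,
    0b110: lambda x, y: ((x * y) % 2 + (x * y) % 3) % 2 == 0,
    0b111: lambda x, y: ((x * y) % 3 + (x + y) % 2) % 2 == 0,
}


def apply_mask(matrix: list, pattern: int, is_reserved) -> list:
    """Return a copy of the matrix with the mask XORed onto the encoding region."""
    condition = MASK_CONDITIONS[pattern]
    return [
        [bit ^ 1 if condition(x, y) and not is_reserved(x, y) else bit
         for x, bit in enumerate(row)]
        for y, row in enumerate(matrix)
    ]


# Masking is an XOR, so applying the same pattern twice restores the input.
blank = [[0] * 6 for _ in range(6)]
masked = apply_mask(blank, 0b000, lambda x, y: False)
for row in masked:
    print("".join("#" if module else "." for module in row))
```

Because unmasking is the same XOR operation, a decoder can reuse `apply_mask` with the pattern read from the Format Information.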
- ## Reflectance
- Symbols are intended to be read when either dark on light or light on dark.
- The International Standard (ISO/IEC 18004) is based on dark images on a light background (example on the left);
- reflectance reversal therefore means a light image on a dark background (example on the right).
- <p align="center">
- <img alt="Normal reflectance" style="margin: 0.25em;" src="">
- <img alt="Reversed reflectance" style="margin: 0.25em;" src="">
- </p>