public final class BidiAlgorithm extends Object
This implementation is not optimized for performance. It is intended as a reference implementation that closely follows the specification of the Bidirectional Algorithm in The Unicode Standard version 6.3.
Input:
There are two levels of input to the algorithm, since clients may prefer to
supply some information from out-of-band sources rather than relying on the
default behavior.
Output:
Output is separated into several stages as well, to better enable clients to
evaluate various aspects of implementation conformance.
Because the algorithm is defined to operate on a single paragraph at a time, this implementation handles single paragraphs only. Rule P1 is therefore presumed: the data provided to the implementation is assumed to be a single paragraph that either contains no 'B' codes or has a single 'B' code at the end of the input. 'B' is allowed as input to illustrate how the algorithm assigns it a level.
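The single-paragraph assumption can be checked on the caller's side before constructing the object. The following is a minimal sketch, not part of this class: the class and method names are hypothetical, and the 'B' code is passed in as a parameter (for example BidiAlgorithm.B) because the constant's numeric value is not given here.

```java
final class SingleParagraphCheck {
    /**
     * Returns true if the types array contains no paragraph-separator code,
     * or exactly one such code as its last element.
     * paragraphSeparator is the caller-supplied 'B' type code.
     */
    static boolean isSingleParagraph(byte[] types, byte paragraphSeparator) {
        for (int i = 0; i < types.length; i++) {
            // A 'B' code anywhere except the final position means more than one paragraph.
            if (types[i] == paragraphSeparator && i != types.length - 1) {
                return false;
            }
        }
        return true;
    }
}
```

Whether the implementation itself rejects input that violates this assumption is not stated, so a check like this is only a precaution.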
Also note that rules L3 and L4 depend on the rendering engine that uses the result of the bidi algorithm. This implementation assumes that the rendering engine expects combining marks in visual order (e.g. to the left of their base character in RTL runs) and that it adjusts the glyphs used to render mirrored characters that are in RTL runs so that they render appropriately.
Modifier and Type | Field | Description |
---|---|---|
static byte | AL | Right-to-Left Arabic |
static byte | AN | Arabic Number |
static byte | B | Paragraph Separator |
static byte | BN | Boundary Neutral |
static byte | CS | Common Number Separator |
static byte | EN | European Number |
static byte | ES | European Number Separator |
static byte | ET | European Number Terminator |
static byte | FSI | First-Strong Isolate |
static byte | implicitEmbeddingLevel | |
static byte | L | Left-to-Right |
static byte | LRE | Left-to-Right Embedding |
static byte | LRI | Left-to-Right Isolate |
static byte | LRO | Left-to-Right Override |
static int | MAX_DEPTH | |
static byte | NSM | Non-Spacing Mark |
static byte | ON | Other Neutrals |
BidiPBAAlgorithm | pba | |
static byte | PDF | Pop Directional Format |
static byte | PDI | Pop Directional Isolate |
static byte | R | Right-to-Left |
static byte | RLE | Right-to-Left Embedding |
static byte | RLI | Right-to-Left Isolate |
static byte | RLO | Right-to-Left Override |
static byte | S | Segment Separator |
static byte | TYPE_MAX | Maximum bidi type value. |
static byte | TYPE_MIN | Minimum bidi type value. |
static String[] | typenames | Shorthand names of bidi type values, for error reporting. |
static byte | WS | Whitespace |
Constructor | Description |
---|---|
BidiAlgorithm(byte[] types, byte[] pairTypes, int[] pairValues) | Initialize using several arrays, then run the algorithm. |
BidiAlgorithm(byte[] types, byte[] pairTypes, int[] pairValues, byte paragraphEmbeddingLevel) | Initialize using several arrays of direction and other types and an externally supplied paragraph embedding level. |
Modifier and Type | Method | Description |
---|---|---|
static BidiAlgorithm | analyzeInput(byte[] types, byte[] pairTypes, int[] pairValues, byte paragraphEmbeddingLevel) | Static entry point for testing using several arrays of direction and other types and an externally supplied paragraph embedding level. |
static int[] | computeReordering(byte[] levels) | Return reordering array for a given level array. |
byte | getBaseLevel() | Return the base level of the paragraph. |
byte[] | getLevels(int[] linebreaks) | Return levels array breaking lines at offsets in linebreaks. Rule L1. |
int[] | getReordering(int[] linebreaks) | Return reordering array breaking lines at offsets in linebreaks. |
byte[] | getResultTypes() | |
static int[] | inverseReordering(int[] reordering) | |
static boolean | isRemovedByX9(byte biditype) | Return true if the type is one of the types removed in X9. |
public static final byte implicitEmbeddingLevel
public BidiPBAAlgorithm pba
public static final byte L
public static final byte LRE
public static final byte LRO
public static final byte R
public static final byte AL
public static final byte RLE
public static final byte RLO
public static final byte PDF
public static final byte EN
public static final byte ES
public static final byte ET
public static final byte AN
public static final byte CS
public static final byte NSM
public static final byte BN
public static final byte B
public static final byte S
public static final byte WS
public static final byte ON
public static final byte LRI
public static final byte RLI
public static final byte FSI
public static final byte PDI
public static final byte TYPE_MIN
public static final byte TYPE_MAX
public static final String[] typenames
public static final int MAX_DEPTH
public BidiAlgorithm(byte[] types, byte[] pairTypes, int[] pairValues)
types
- Array of types ranging from TYPE_MIN to TYPE_MAX inclusive and representing the direction codes of the characters in the text.
pairTypes
- Array of paired bracket types of the characters, ranging from 0 (none) to 2 (closing).
pairValues
- Array identifying the set of matching bracket characters, as defined in BidiPBAReference, to which each character belongs (both the opening and the closing bracket get the same value if they are part of the same canonical "set" or pair).
public BidiAlgorithm(byte[] types, byte[] pairTypes, int[] pairValues, byte paragraphEmbeddingLevel)
2 means to apply the default algorithm (rules P2 and P3), 0 is for LTR paragraphs, and 1 is for RTL paragraphs.
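The three accepted values can be illustrated with a small stand-alone sketch of rules P2 and P3 as UAX #9 describes them: find the first strong character (L, R or AL) that is not inside an isolate, and use level 1 if it is R or AL, level 0 otherwise. This is not this class's code. The enum, class and method names are hypothetical stand-ins, and throwing on values other than 0, 1 or 2 is an assumption of the sketch.

```java
final class ParagraphLevelSketch {
    // Hypothetical stand-ins for the bidi types; not the byte constants of BidiAlgorithm.
    enum T { L, R, AL, LRI, RLI, FSI, PDI, OTHER }

    /** Returns the requested level if it is 0 or 1; for 2, derives the level with rules P2 and P3. */
    static byte resolveParagraphLevel(T[] types, byte requested) {
        if (requested == 0 || requested == 1) {
            return requested; // externally supplied LTR (0) or RTL (1) paragraph
        }
        if (requested != 2) {
            throw new IllegalArgumentException("level must be 0, 1 or 2: " + requested);
        }
        int isolateDepth = 0; // > 0 while scanning inside an isolate, which P2 skips
        for (T t : types) {
            switch (t) {
                case LRI: case RLI: case FSI:
                    isolateDepth++;
                    break;
                case PDI:
                    if (isolateDepth > 0) isolateDepth--; // an unmatched PDI is ignored
                    break;
                case L:
                    if (isolateDepth == 0) return 0;
                    break;
                case R: case AL:
                    if (isolateDepth == 0) return 1;
                    break;
                default:
                    break;
            }
        }
        return 0; // P3: no strong character found outside isolates
    }
}
```

An isolate initiator with no matching PDI leaves the depth above zero, so the rest of the paragraph is skipped, which matches the P2 wording.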
types
- the types array
pairTypes
- the paired bracket types array
pairValues
- the paired bracket values array
paragraphEmbeddingLevel
- the externally supplied paragraph embedding level.
public byte[] getResultTypes()
public byte[] getLevels(int[] linebreaks)
The returned levels array contains the resolved level for each bidi code passed to the constructor.
The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.
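The constraints on linebreaks translate directly into a caller-side check. The sketch below uses hypothetical class and method names and simply encodes the three stated conditions; whether the implementation itself validates its argument is not stated here.

```java
final class LinebreakCheck {
    /**
     * Returns true if linebreaks has at least one value, is strictly
     * increasing, starts at 1 or higher, and ends at textLength.
     */
    static boolean isValidLinebreaks(int[] linebreaks, int textLength) {
        if (linebreaks == null || linebreaks.length == 0) {
            return false; // at least one value is required
        }
        int previous = 0; // forces the first value to be >= 1
        for (int offset : linebreaks) {
            if (offset <= previous) {
                return false; // not strictly increasing, or below 1
            }
            previous = offset;
        }
        // The last value must be the length of the text, which also bounds all others.
        return previous == textLength;
    }
}
```

For a ten-character paragraph, {3, 7, 10} passes, while {3, 3, 10} (duplicate) and {3, 7} (last value is not the text length) do not.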
linebreaks
- the offsets at which to break the paragraph
public int[] getReordering(int[] linebreaks)
The reordering array maps from a visual index to a logical index. Lines are concatenated from left to right. So for example, the fifth character from the left on the third line is
getReordering(linebreaks)[linebreaks[1] + 4]
(linebreaks[1] is the position after the last character of the second line, which is also the index of the first character on the third line, and adding four gets the fifth character from the left).
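The index arithmetic above generalizes to any line and column. The helper below is a hypothetical illustration, not a method of this class, and the sample arrays are made-up data: fifteen characters broken into three lines of five, with the third line displayed right to left.

```java
final class VisualLookupSketch {
    /**
     * Returns the logical index of the character at a visual position.
     * line and column are zero-based; column counts from the left edge of the line.
     */
    static int logicalIndexAt(int[] reordering, int[] linebreaks, int line, int column) {
        // A line starts where the previous one ended; the first line starts at 0.
        int lineStart = (line == 0) ? 0 : linebreaks[line - 1];
        int lineLimit = linebreaks[line];
        if (column < 0 || lineStart + column >= lineLimit) {
            throw new IndexOutOfBoundsException("column " + column + " is outside line " + line);
        }
        return reordering[lineStart + column];
    }

    public static void main(String[] args) {
        int[] linebreaks = {5, 10, 15};
        // Made-up visual-to-logical map: lines one and two unchanged, line three reversed.
        int[] reordering = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 14, 13, 12, 11, 10};
        // Fifth character from the left on the third line: reordering[linebreaks[1] + 4].
        System.out.println(logicalIndexAt(reordering, linebreaks, 2, 4)); // prints 10
    }
}
```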
The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.
linebreaks
- the offsets at which to break the paragraph.
public static int[] computeReordering(byte[] levels)
levels
- a given level array
public static int[] inverseReordering(int[] reordering)
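Neither computeReordering nor inverseReordering is described in detail here. As a rough illustration only, the sketch below shows how a visual-to-logical map can be derived from the levels of a single line with rule L2 of UAX #9 (from the highest level down to the lowest odd level, reverse every run at that level or higher), and how such a map can be inverted into a logical-to-visual map. The class and method names are hypothetical, and that the real methods behave this way is an assumption.

```java
final class ReorderSketch {
    /** Rule L2 on one line: returns a map from visual index to logical index. */
    static int[] visualToLogical(byte[] levels) {
        int n = levels.length;
        int[] result = new int[n];
        int highest = 0;
        int lowestOdd = Integer.MAX_VALUE; // stays huge if no odd level exists
        for (int i = 0; i < n; i++) {
            result[i] = i; // start from logical order
            if (levels[i] > highest) highest = levels[i];
            if ((levels[i] & 1) != 0 && levels[i] < lowestOdd) lowestOdd = levels[i];
        }
        for (int level = highest; level >= lowestOdd; level--) {
            int i = 0;
            while (i < n) {
                if (levels[i] < level) { i++; continue; }
                // Find the run [start, limit) of positions at this level or higher.
                int start = i;
                int limit = i + 1;
                while (limit < n && levels[limit] >= level) limit++;
                // Reverse the run in place.
                for (int a = start, b = limit - 1; a < b; a++, b--) {
                    int tmp = result[a];
                    result[a] = result[b];
                    result[b] = tmp;
                }
                i = limit;
            }
        }
        return result;
    }

    /** Inverts a permutation: if map[v] == l then the result has inverse[l] == v. */
    static int[] invert(int[] map) {
        int[] inverse = new int[map.length];
        for (int i = 0; i < map.length; i++) {
            inverse[map[i]] = i;
        }
        return inverse;
    }
}
```

For levels {1, 1, 2, 2, 1}, an RTL line with a two-character LTR run, visualToLogical returns {4, 2, 3, 1, 0} and invert of that result returns {4, 3, 1, 2, 0}.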
public byte getBaseLevel()
public static boolean isRemovedByX9(byte biditype)
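In UAX #9, rule X9 removes the explicit embedding and override codes, PDF, and boundary neutrals (RLE, LRE, RLO, LRO, PDF and BN); isolate initiators and PDI are not removed. A minimal sketch of such a membership test follows, using a hypothetical enum in place of this class's byte constants, whose numeric values are not given here.

```java
import java.util.EnumSet;
import java.util.Set;

final class X9Sketch {
    // Hypothetical stand-ins for the bidi types; not the byte constants of BidiAlgorithm.
    enum BidiClass {
        L, R, AL, EN, ES, ET, AN, CS, NSM, BN, B, S, WS, ON,
        LRE, LRO, RLE, RLO, PDF, LRI, RLI, FSI, PDI
    }

    // The types that rule X9 removes from further processing.
    static final Set<BidiClass> REMOVED_BY_X9 = EnumSet.of(
            BidiClass.LRE, BidiClass.RLE, BidiClass.LRO,
            BidiClass.RLO, BidiClass.PDF, BidiClass.BN);

    /** Returns true if the given type is one that rule X9 removes. */
    static boolean isRemovedByX9(BidiClass type) {
        return REMOVED_BY_X9.contains(type);
    }
}
```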
biditype
- the bidi type value to test
public static final BidiAlgorithm analyzeInput(byte[] types, byte[] pairTypes, int[] pairValues, byte paragraphEmbeddingLevel)
2 means to apply the default algorithm (rules P2 and P3), 0 is for LTR paragraphs, and 1 is for RTL paragraphs.
types
- the directional types array
pairTypes
- the paired bracket types array
pairValues
- the paired bracket values array
paragraphEmbeddingLevel
- the externally supplied paragraph embedding level.
Copyright © 1998–2017 iText Group NV. All rights reserved.