class Poco::Latin9Encoding

Overview

ISO Latin-9 (8859-15) text encoding.

#include <Poco/Latin9Encoding.h>

class Latin9Encoding: public Poco::TextEncoding
{
public:
    // methods

    virtual
    const char*
    canonicalName() const;

    virtual
    bool
    isA(const std::string& encodingName) const;

    virtual
    const CharacterMap&
    characterMap() const;

    virtual
    int
    convert(const unsigned char* bytes) const;

    virtual
    int
    convert(
        int ch,
        unsigned char* bytes,
        int length
        ) const;

    virtual
    int
    queryConvert(
        const unsigned char* bytes,
        int length
        ) const;

    virtual
    int
    sequenceLength(
        const unsigned char* bytes,
        int length
        ) const;
};

Inherited Members

public:
    // typedefs

    typedef SharedPtr<TextEncoding> Ptr;
    typedef int CharacterMap[256];

    // enums

    enum
    {
        MAX_SEQUENCE_LENGTH = 6,
    };

    // fields

    static const std::string GLOBAL;

    // methods

    virtual
    const char*
    canonicalName() const = 0;

    virtual
    bool
    isA(const std::string& encodingName) const = 0;

    virtual
    const CharacterMap&
    characterMap() const = 0;

    virtual
    int
    convert(const unsigned char* bytes) const;

    virtual
    int
    queryConvert(
        const unsigned char* bytes,
        int length
        ) const;

    virtual
    int
    sequenceLength(
        const unsigned char* bytes,
        int length
        ) const;

    virtual
    int
    convert(
        int ch,
        unsigned char* bytes,
        int length
        ) const;

    static
    TextEncoding&
    byName(const std::string& encodingName);

    static
    TextEncoding::Ptr
    find(const std::string& encodingName);

    static
    void
    add(TextEncoding::Ptr encoding);

    static
    void
    add(
        TextEncoding::Ptr encoding,
        const std::string& name
        );

    static
    void
    remove(const std::string& encodingName);

    static
    TextEncoding::Ptr
    global(TextEncoding::Ptr encoding);

    static
    TextEncoding&
    global();

protected:
    // methods

    static
    TextEncodingManager&
    manager();

Detailed Documentation

ISO Latin-9 (8859-15) text encoding.

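Latin-9 is Latin-1 with eight code points reassigned: the currency sign at 0xA4 becomes the euro sign (U+20AC), and 0xA6, 0xA8, 0xB4, 0xB8, 0xBC, 0xBD and 0xBE become Š, š, Ž, ž, Œ, œ and Ÿ.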
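To make the difference from Latin-1 concrete, the following is a minimal sketch of a byte-to-code-point mapping for ISO-8859-15. The function name latin9ToUnicode is hypothetical and is not part of Poco; the code only illustrates the character set, not how the library implements it.

```cpp
// Hypothetical helper, not part of Poco: maps one ISO-8859-15 byte to its
// Unicode code point. Only eight positions differ from ISO-8859-1.
int latin9ToUnicode(unsigned char byte)
{
    switch (byte)
    {
    case 0xA4: return 0x20AC; // euro sign            (Latin-1: currency sign)
    case 0xA6: return 0x0160; // S with caron         (Latin-1: broken bar)
    case 0xA8: return 0x0161; // s with caron         (Latin-1: diaeresis)
    case 0xB4: return 0x017D; // Z with caron         (Latin-1: acute accent)
    case 0xB8: return 0x017E; // z with caron         (Latin-1: cedilla)
    case 0xBC: return 0x0152; // OE ligature          (Latin-1: one quarter)
    case 0xBD: return 0x0153; // oe ligature          (Latin-1: one half)
    case 0xBE: return 0x0178; // Y with diaeresis     (Latin-1: three quarters)
    default:   return byte;   // all other bytes keep their Latin-1 meaning
    }
}
```

Every byte outside those eight positions maps to the code point with the same numeric value, which is why text that avoids them reads identically in both encodings.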

Methods

virtual
const char*
canonicalName() const

Returns the canonical name of this encoding, e.g. “ISO-8859-1”.

Encoding name comparisons are case insensitive.

virtual
bool
isA(const std::string& encodingName) const

Returns true if the given name is one of the names of this encoding.

For example, the “ISO-8859-1” encoding is also known as “Latin-1”.

Encoding name comparisons are case insensitive.
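A case-insensitive alias check of the kind described above could be sketched as follows. The helper names (equalsIgnoreCase, isOneOf) and the alias list LATIN9_NAMES are assumptions made for illustration; they are not taken from the Poco sources.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: case-insensitive comparison of two ASCII names.
bool equalsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::string::size_type i = 0; i < a.size(); ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Sketch of an isA()-style check against a null-terminated alias list.
bool isOneOf(const std::string& encodingName, const char* const* names)
{
    for (; *names; ++names)
    {
        if (equalsIgnoreCase(encodingName, *names)) return true;
    }
    return false;
}

// Assumed alias list for illustration; the first entry plays the role of
// the canonical name.
const char* const LATIN9_NAMES[] = { "ISO-8859-15", "Latin9", "Latin-9", nullptr };
```

With this list, isOneOf("latin-9", LATIN9_NAMES) is true, while isOneOf("ISO-8859-1", LATIN9_NAMES) is false.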

virtual
const CharacterMap&
characterMap() const

Returns the CharacterMap for the encoding.

The CharacterMap should be kept in a static member. As characterMap() can be called frequently, it should be implemented in such a way that it just returns a static map. If the map is built at runtime, this should be done in the constructor.
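The "static map returned by reference" advice can be sketched like this. SketchEncoding and SketchCharacterMap are hypothetical stand-ins that do not derive from Poco::TextEncoding, and the map is filled with an identity (Latin-1 style) mapping only to keep the example short.

```cpp
#include <array>
#include <cstddef>

// Stand-in for a 256-entry character map (one entry per byte value).
using SketchCharacterMap = std::array<int, 256>;

// Hypothetical class for illustration only.
class SketchEncoding
{
public:
    // Hands out a reference to the single shared map; no work per call.
    const SketchCharacterMap& characterMap() const
    {
        return _charMap;
    }

private:
    static const SketchCharacterMap _charMap;
};

// Builds an identity map: byte value n maps to code point n.
static SketchCharacterMap buildIdentityMap()
{
    SketchCharacterMap map{};
    for (std::size_t i = 0; i < map.size(); ++i)
    {
        map[i] = static_cast<int>(i);
    }
    return map;
}

// The map is built once, before main() runs, and shared by all instances.
const SketchCharacterMap SketchEncoding::_charMap = buildIdentityMap();
```

Because the map is a static member, every instance returns the same object, so repeated calls cost no more than returning a reference.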

virtual
int
convert(const unsigned char* bytes) const

The convert function is used to convert multibyte sequences; bytes will point to a byte sequence of n bytes where sequenceLength(bytes, length) == -n, with length >= n.

The convert function must return the Unicode scalar value represented by this byte sequence or -1 if the byte sequence is malformed. The default implementation returns (int) bytes[0].

virtual
int
convert(
    int ch,
    unsigned char* bytes,
    int length
    ) const

Transforms the Unicode character ch into the encoding’s byte sequence.

The method returns the number of bytes used. The method must not use more than length bytes. bytes and length can also be null; in this case only the number of bytes required to represent ch is returned. If the character cannot be converted, 0 is returned and the byte sequence remains unchanged. The default implementation simply returns 0.
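The contract above (bytes used on success, required size when no buffer is given, 0 when the character is not representable) can be illustrated with a free function for ISO-8859-15. This is a sketch written against the documented behaviour; encodeLatin9 is a hypothetical name and the code is not claimed to match Poco's implementation.

```cpp
// Hypothetical encoder for ISO-8859-15, following the documented contract.
int encodeLatin9(int ch, unsigned char* bytes, int length)
{
    unsigned char out = 0;
    switch (ch)
    {
    // Code points that Latin-9 places where Latin-1 had other symbols.
    case 0x20AC: out = 0xA4; break; // euro sign
    case 0x0160: out = 0xA6; break; // S with caron
    case 0x0161: out = 0xA8; break; // s with caron
    case 0x017D: out = 0xB4; break; // Z with caron
    case 0x017E: out = 0xB8; break; // z with caron
    case 0x0152: out = 0xBC; break; // OE ligature
    case 0x0153: out = 0xBD; break; // oe ligature
    case 0x0178: out = 0xBE; break; // Y with diaeresis
    // The Latin-1 symbols displaced by the above have no Latin-9 byte.
    case 0xA4: case 0xA6: case 0xA8: case 0xB4:
    case 0xB8: case 0xBC: case 0xBD: case 0xBE:
        return 0;
    default:
        if (ch < 0 || ch > 0xFF) return 0; // outside the single-byte range
        out = static_cast<unsigned char>(ch);
        break;
    }
    // Write only if the caller supplied a buffer with room for one byte.
    if (bytes && length >= 1) bytes[0] = out;
    return 1; // bytes used, or bytes required when only querying
}
```

Calling encodeLatin9(ch, nullptr, 0) first is a way to ask whether ch is representable without touching any buffer.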

virtual
int
queryConvert(
    const unsigned char* bytes,
    int length
    ) const

The queryConvert function is used to convert single byte characters or multibyte sequences; bytes will point to a byte sequence of length bytes.

The queryConvert function must return the Unicode scalar value represented by this byte sequence, -1 if the byte sequence is malformed, or -n if length is shorter than the sequence, where n is the number of bytes requested for the sequence. The length of the sequence might not be determined by the first byte, in which case the conversion becomes an iterative process: a first call with length == 1 might return -2, then a second call with length == 2 might return -4. Eventually, the third call with length == 4 should return either a Unicode scalar value or -1 if the byte sequence is malformed. The default implementation returns (int) bytes[0].
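The iterative protocol is easier to follow with a caller loop. The sketch below uses a toy decoder that understands only the 1-byte and 2-byte forms of UTF-8; toyQueryConvert and decodeOne are hypothetical names, and a single-byte encoding such as Latin-9 would be expected to resolve on the first call.

```cpp
// Toy decoder following the return convention described above.
// Handles only 1-byte and 2-byte UTF-8 forms; everything else is
// reported as malformed to keep the example short.
int toyQueryConvert(const unsigned char* bytes, int length)
{
    if (length < 1) return -1;
    const unsigned char b0 = bytes[0];
    if (b0 < 0x80) return b0;                      // single byte
    if ((b0 & 0xE0) == 0xC0)                       // lead byte of a 2-byte form
    {
        if (length < 2) return -2;                 // request 2 bytes
        if ((bytes[1] & 0xC0) != 0x80) return -1;  // bad continuation byte
        const int cp = ((b0 & 0x1F) << 6) | (bytes[1] & 0x3F);
        return cp >= 0x80 ? cp : -1;               // reject overlong forms
    }
    return -1;
}

// Caller loop: start with one byte and widen the window while the decoder
// asks for more. Returns the code point, or -1 if the input is malformed
// or ends too early. consumed receives the number of bytes used.
int decodeOne(const unsigned char* bytes, int available, int& consumed)
{
    consumed = 0;
    int length = 1;
    while (length <= available)
    {
        const int result = toyQueryConvert(bytes, length);
        if (result >= 0)
        {
            consumed = length;
            return result;
        }
        if (result == -1 || -result <= length) return -1; // malformed or no progress
        length = -result;                                 // -n: retry with n bytes
    }
    return -1; // buffer ended before the sequence was complete
}
```

The guard on -result <= length keeps the loop from spinning if a decoder ever requests no more bytes than it was already given.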

virtual
int
sequenceLength(
    const unsigned char* bytes,
    int length
    ) const

The sequenceLength function is used to get the length of the sequence pointed to by bytes.

The length parameter should be greater than or equal to the length of the sequence.

The sequenceLength function must return the length of the sequence represented by this byte sequence, or a negative value -n if length is shorter than the sequence, where n is the number of bytes requested to determine the length of the sequence. The length of the sequence might not be determined by the first byte, in which case the conversion becomes an iterative process as long as the result is negative: a first call with length == 1 might return -2, then a second call with length == 2 might return -4. Eventually, the third call with length == 4 should return 4. The default implementation returns 1.
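One use of a sequence-length query is stepping through a buffer one character at a time. The sketch below counts characters with two hypothetical length functions: one that always answers 1, as a single-byte encoding would, and a toy UTF-8 version that reads the length from the lead byte. None of these names come from Poco.

```cpp
// Signature shared by the two toy length functions below.
using SequenceLengthFn = int (*)(const unsigned char* bytes, int length);

// Single-byte encodings: every sequence is exactly one byte long.
int singleByteSequenceLength(const unsigned char*, int)
{
    return 1;
}

// Toy UTF-8 version: the lead byte alone determines the length.
int toySequenceLength(const unsigned char* bytes, int length)
{
    if (length < 1) return -1;               // at least one byte is needed
    const unsigned char b0 = bytes[0];
    if (b0 < 0x80) return 1;
    if ((b0 & 0xE0) == 0xC0) return 2;
    if ((b0 & 0xF0) == 0xE0) return 3;
    if ((b0 & 0xF8) == 0xF0) return 4;
    return 1;                                // invalid lead byte: step over it
}

// Counts characters by stepping over whole sequences.
// Returns -1 if the last sequence is cut off by the end of the buffer.
int countCharacters(const unsigned char* bytes, int size, SequenceLengthFn seqLen)
{
    int count = 0;
    int pos = 0;
    while (pos < size)
    {
        const int n = seqLen(bytes + pos, size - pos);
        if (n <= 0 || n > size - pos) return -1; // more bytes needed than remain
        pos += n;
        ++count;
    }
    return count;
}
```

The same six bytes count as three characters under the toy UTF-8 rule and as six under the single-byte rule, which is the practical meaning of a default result of 1.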