FSW Module
The fsw module contains functions for handling Formal SignWriting in ASCII (FSW) characters.
FSW characters definition: https://datatracker.ietf.org/doc/id/draft-slevinski-formal-signwriting-10.html#name-formal-signwriting-in-ascii
- sutton_signwriting_core.fsw.fsw_is_type(sym_key, type_name)
Test whether an FSW symbol key is of the given type/range.
- Parameters:
sym_key (str) – an FSW symbol key
type_name (str) – the name of a symbol range
- Returns:
True if the symbol is of the specified type
- Return type:
bool
Example
>>> fsw_is_type('S10000', 'hand')
True
Note
The following type_name values are supported:
all - All symbols used in Formal SignWriting.
writing - Symbols that can be used in the spatial signbox or the temporal prefix.
hand - Various handshapes.
movement - Contact symbols, small finger movements, straight arrows, curved arrows, and circles.
dynamic - Dynamic symbols used to express “feeling” or “tempo” of movement.
head - Symbols for the head and face.
hcenter - Used to determine the horizontal center of a sign (same as the head type).
vcenter - Used to determine the vertical center of a sign (includes head and trunk types).
trunk - Symbols for torso movement, shoulders, and hips.
limb - Symbols for limbs and fingers.
location - Detailed location symbols used only in the temporal prefix.
punctuation - Symbols used to divide signs into sentences.
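The type check can be pictured as a range test on the three hex digits that follow the leading S in a symbol key. The following is a minimal sketch, not the library's implementation: the name is_type_sketch is hypothetical, and the hex bounds are assumptions transcribed from the FSW draft that should be verified against it.

```python
# Inclusive hex bounds for the symbol base (characters 2-4 of the key).
# ASSUMPTION: bounds transcribed from the FSW draft; verify before relying on them.
SYMBOL_RANGES = {
    "all": (0x100, 0x38B),
    "writing": (0x100, 0x37E),
    "hand": (0x100, 0x204),
    "movement": (0x205, 0x2F6),
    "dynamic": (0x2F7, 0x2FE),
    "head": (0x2FF, 0x36C),
    "hcenter": (0x2FF, 0x36C),
    "vcenter": (0x2FF, 0x375),
    "trunk": (0x36D, 0x375),
    "limb": (0x376, 0x37E),
    "location": (0x37F, 0x386),
    "punctuation": (0x387, 0x38B),
}


def is_type_sketch(sym_key: str, type_name: str) -> bool:
    """Hypothetical helper: range test on the symbol base of an FSW key."""
    bounds = SYMBOL_RANGES.get(type_name)
    if bounds is None or len(sym_key) < 4 or sym_key[0] != "S":
        return False
    try:
        base = int(sym_key[1:4], 16)  # e.g. 'S10000' -> 0x100
    except ValueError:
        return False
    low, high = bounds
    return low <= base <= high
```

Under these assumed bounds, 'S10000' falls in the hand range and 'S38800' in the punctuation range.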
- sutton_signwriting_core.fsw.fsw_colorize(key)
Function that returns the standardized color for a symbol.
- Parameters:
key (str) – an FSW symbol key
- Returns:
standardized color for the symbol, as a hex color string
- Return type:
str
Example
>>> fsw_colorize('S10000')
'#0000CC'
- sutton_signwriting_core.fsw.fsw_parse_symbol(fsw_sym)
Parse an FSW symbol with optional coordinate and style string.
- Parameters:
fsw_sym (str) – an FSW symbol string
- Returns:
Dictionary with ‘symbol’, ‘coord’, ‘style’ keys
- Return type:
Example
>>> fsw_parse_symbol('S10000500x500-C')
{'symbol': 'S10000', 'coord': [500, 500], 'style': '-C'}
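To make the symbol string layout concrete, here is a minimal regex sketch of the same split into key, coordinate, and style. It is an illustration under assumptions, not the library's code: parse_symbol_sketch and SYMBOL_RE are hypothetical names, and the pattern is a simplified reading of the FSW draft.

```python
import re

# ASSUMPTION: simplified pattern based on the FSW draft, not the library's own regex.
SYMBOL_RE = re.compile(
    r"^(S[123][0-9a-f]{2}[0-5][0-9a-f])"  # symbol key: base, fill, rotation
    r"(?:([0-9]{3})x([0-9]{3}))?"          # optional coordinate, e.g. 500x500
    r"(-.*)?$"                             # optional style string, e.g. -C
)


def parse_symbol_sketch(fsw_sym: str) -> dict:
    """Hypothetical helper: split an FSW symbol string into its parts."""
    match = SYMBOL_RE.match(fsw_sym)
    if match is None:
        return {}
    key, x, y, style = match.groups()
    result = {"symbol": key}
    if x is not None:
        result["coord"] = [int(x), int(y)]
    if style:
        result["style"] = style
    return result
```

Keys for parts that are absent from the input are simply omitted in this sketch; the library may behave differently.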
- sutton_signwriting_core.fsw.fsw_parse_sign(fsw_sign)
Parse an FSW sign with optional style string.
- Parameters:
fsw_sign (str) – an FSW sign string
- Returns:
Dictionary with ‘sequence’, ‘box’, ‘max’, ‘spatials’, ‘style’ keys
- Return type:
Example
>>> fsw_parse_sign('AS10011S10019S2e704S2e748M525x535S2e748483x510S10011501x466S2e704510x500S10019476x475-C')
{'sequence': ['S10011', 'S10019', 'S2e704', 'S2e748'], 'box': 'M', 'max': [525, 535], 'spatials': [{'symbol': 'S2e748', 'coord': [483, 510]}, {'symbol': 'S10011', 'coord': [501, 466]}, {'symbol': 'S2e704', 'coord': [510, 500]}, {'symbol': 'S10019', 'coord': [476, 475]}], 'style': '-C'}
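The structure of a sign string (optional temporal prefix starting with A, a box marker with the maximum coordinate, the spatial symbols, and an optional style string) can be sketched with a single regular expression. This is a simplified illustration, not the library's implementation; parse_sign_sketch and the pattern names are hypothetical.

```python
import re

# ASSUMPTION: simplified patterns based on the FSW draft.
SYM = r"S[123][0-9a-f]{2}[0-5][0-9a-f]"
COORD = r"[0-9]{3}x[0-9]{3}"
SIGN_RE = re.compile(
    rf"^(?:A((?:{SYM})+))?"   # optional temporal prefix (sequence)
    rf"([BLMR])({COORD})"     # box marker and maximum coordinate
    rf"((?:{SYM}{COORD})*)"   # spatial symbols with coordinates
    r"(-.*)?$"                # optional style string
)


def _coord(text: str) -> list:
    x, y = text.split("x")
    return [int(x), int(y)]


def parse_sign_sketch(fsw_sign: str) -> dict:
    """Hypothetical helper: split an FSW sign string into its parts."""
    match = SIGN_RE.match(fsw_sign)
    if match is None:
        return {}
    prefix, box, max_coord, spatials, style = match.groups()
    result = {}
    if prefix:
        result["sequence"] = re.findall(SYM, prefix)
    result["box"] = box
    result["max"] = _coord(max_coord)
    result["spatials"] = [
        {"symbol": key, "coord": _coord(coord)}
        for key, coord in re.findall(rf"({SYM})({COORD})", spatials)
    ]
    if style:
        result["style"] = style
    return result
```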
- sutton_signwriting_core.fsw.fsw_parse_text(fsw_text)
Parse an FSW text into signs and punctuation.
- Parameters:
fsw_text (str) – an FSW text string
- Returns:
List of FSW sign and punctuation strings
- Return type:
List[str]
Example
>>> fsw_parse_text('AS14c20S27106M518x529S14c20481x471S27106503x489 AS18701S1870aS2e734S20500M518x533S1870a489x515S18701482x490S20500508x496S2e734500x468 S38800464x496')
['AS14c20S27106M518x529S14c20481x471S27106503x489', 'AS18701S1870aS2e734S20500M518x533S1870a489x515S18701482x490S20500508x496S2e734500x468', 'S38800464x496']
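One way to picture the text split is as a scan for either a sign or a positioned punctuation symbol. The sketch below uses simplified patterns and hypothetical names (parse_text_sketch, TEXT_RE); the punctuation range S387 to S38b is an assumption taken from the FSW draft, and the library's own matching rules may differ.

```python
import re

# ASSUMPTION: simplified patterns based on the FSW draft.
SYM = r"S[123][0-9a-f]{2}[0-5][0-9a-f]"
COORD = r"[0-9]{3}x[0-9]{3}"
STYLE = r"(?:-\S*)?"
SIGN = rf"(?:A(?:{SYM})+)?[BLMR]{COORD}(?:{SYM}{COORD})*{STYLE}"
PUNCT = rf"S38[7-9ab][0-5][0-9a-f]{COORD}{STYLE}"
TEXT_RE = re.compile(rf"{SIGN}|{PUNCT}")


def parse_text_sketch(fsw_text: str) -> list:
    """Hypothetical helper: list the signs and punctuation found in a text."""
    # All groups are non-capturing, so findall returns the full matches.
    return TEXT_RE.findall(fsw_text)
```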
- sutton_signwriting_core.fsw.fsw_compose_symbol(fsw_sym_object)
Compose an FSW symbol string with optional coordinate and style string.
- Parameters:
fsw_sym_object (SymbolObject) – an FSW symbol object
- Returns:
an FSW symbol string
- Return type:
str | None
Example
>>> fsw_compose_symbol({'symbol': 'S10000', 'coord': [480, 480], 'style': '-C'})
'S10000480x480-C'
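Composition is the reverse of parsing: the key, the coordinate, and the style string are concatenated in that order. A minimal sketch follows, assuming that a missing symbol key is what produces the None result; that condition and the name compose_symbol_sketch are assumptions, not documented behavior.

```python
from typing import Optional


def compose_symbol_sketch(sym: dict) -> Optional[str]:
    """Hypothetical helper: join key, coordinate, and style into one string."""
    key = sym.get("symbol")
    if not isinstance(key, str) or not key:
        return None  # ASSUMPTION: no usable key means nothing to compose
    coord = sym.get("coord")
    coord_text = f"{coord[0]}x{coord[1]}" if coord else ""
    return key + coord_text + (sym.get("style") or "")
```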
- sutton_signwriting_core.fsw.fsw_compose_sign(fsw_sign_object)
Compose an FSW sign string with optional style string.
- Parameters:
fsw_sign_object (SignObject) – an FSW sign object
- Returns:
an FSW sign string
- Return type:
str | None
Example
>>> fsw_compose_sign({
...     'sequence': ['S10011', 'S10019', 'S2e704', 'S2e748'],
...     'box': 'M',
...     'max': [525, 535],
...     'spatials': [
...         {'symbol': 'S2e748', 'coord': [483, 510]},
...         {'symbol': 'S10011', 'coord': [501, 466]},
...         {'symbol': 'S2e704', 'coord': [510, 500]},
...         {'symbol': 'S10019', 'coord': [476, 475]}
...     ],
...     'style': '-C'
... })
'AS10011S10019S2e704S2e748M525x535S2e748483x510S10011501x466S2e704510x500S10019476x475-C'
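The sign string is assembled in a fixed order: the optional temporal prefix, the box marker with its maximum coordinate, each spatial symbol with its coordinate, then the style string. The sketch below illustrates that ordering only; compose_sign_sketch is a hypothetical name, and returning None when the box or maximum coordinate is missing is an assumption.

```python
from typing import Optional


def compose_sign_sketch(sign: dict) -> Optional[str]:
    """Hypothetical helper: assemble an FSW sign string from its parts."""
    box = sign.get("box")
    max_coord = sign.get("max")
    if not box or not max_coord:
        return None  # ASSUMPTION: a sign needs a box marker and max coordinate
    sequence = sign.get("sequence") or []
    prefix = "A" + "".join(sequence) if sequence else ""
    spatials = "".join(
        f"{item['symbol']}{item['coord'][0]}x{item['coord'][1]}"
        for item in sign.get("spatials") or []
    )
    style = sign.get("style") or ""
    return f"{prefix}{box}{max_coord[0]}x{max_coord[1]}{spatials}{style}"
```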
- sutton_signwriting_core.fsw.fsw_info(fsw)
Gather sizing information about an FSW sign or symbol.
- Parameters:
fsw (str) – an FSW sign or symbol
- Returns:
information about the FSW string
- Return type:
Example
>>> fsw_info('AS14c20S27106L518x529S14c20481x471S27106503x489-P10Z2')
{'minX': 481, 'minY': 471, 'width': 37, 'height': 58, 'lane': -1, 'padding': 10, 'segment': 'sign', 'zoom': 2}
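The sizing values in the example are consistent with a simple bounding-box calculation: the minimum x and y over the spatial coordinates, with width and height measured up to the maximum coordinate after the box marker (518 - 481 = 37, 529 - 471 = 58). The sketch below covers only that arithmetic for a sign and leaves out the padding, segment, and zoom fields; sign_box_sketch is a hypothetical name, and the lane mapping is an assumption inferred from the L marker giving -1 in the example.

```python
import re

SYM = r"S[123][0-9a-f]{2}[0-5][0-9a-f]"
# ASSUMPTION: L maps to -1 as in the example; M -> 0 and R -> 1 are inferred.
LANES = {"L": -1, "M": 0, "R": 1}


def sign_box_sketch(fsw_sign: str) -> dict:
    """Hypothetical helper: bounding box of an FSW sign from its coordinates."""
    box = re.search(r"([BLMR])([0-9]{3})x([0-9]{3})", fsw_sign)
    if box is None:
        return {}
    max_x, max_y = int(box.group(2)), int(box.group(3))
    # Spatial symbols follow the box marker; collect their coordinates.
    coords = [
        (int(x), int(y))
        for x, y in re.findall(SYM + r"([0-9]{3})x([0-9]{3})", fsw_sign[box.end():])
    ]
    if not coords:
        return {}
    min_x = min(x for x, _ in coords)
    min_y = min(y for _, y in coords)
    return {
        "minX": min_x,
        "minY": min_y,
        "width": max_x - min_x,
        "height": max_y - min_y,
        "lane": LANES.get(box.group(1), 0),  # other markers default to 0 here
    }
```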
- sutton_signwriting_core.fsw.fsw_column_defaults_merge(options=None)
Function to merge an object of column options with default values.
- Parameters:
options (ColumnOptions | None) – object of column options
- Returns:
object of column options merged with column defaults
- Return type:
Example
>>> fsw_column_defaults_merge({'height': 500, 'width': 150})
{'height': 500, 'width': 150, 'offset': 50, ...}
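The merge pattern itself is simple: start from a copy of the defaults and let supplied options override them. The sketch below is illustrative only. COLUMN_DEFAULTS_SKETCH is a hypothetical, incomplete set of defaults: only 'offset': 50 is visible in the example output, while the other values, the nested 'style' entry, and the one-level nested merge are assumptions.

```python
import copy

# ASSUMPTION: hypothetical subset of defaults, for illustration only.
COLUMN_DEFAULTS_SKETCH = {
    "height": 500,
    "width": 150,
    "offset": 50,
    "style": {"zoom": 1},
}


def merge_defaults_sketch(options=None) -> dict:
    """Hypothetical helper: overlay options on a copy of the defaults."""
    merged = copy.deepcopy(COLUMN_DEFAULTS_SKETCH)
    for key, value in (options or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge one level of nesting instead of replacing the whole dict.
            merged[key] = {**merged[key], **value}
        else:
            merged[key] = value
    return merged
```

Copying the defaults first keeps later calls from seeing changes a caller makes to a returned dictionary.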
- sutton_signwriting_core.fsw.fsw_columns(fsw_text, options=None)
Function to transform an FSW text to an array of columns.
- Parameters:
fsw_text (str) – FSW text of signs and punctuation
options (ColumnOptions | None) – object of column options
- Returns:
object of column options, widths array, and column data
- Return type:
Example
>>> fsw_columns('AS14c20S27106M518x529S14c20481x471S27106503x489 AS18701S1870aS2e734S20500M518x533S1870a489x515S18701482x490S20500508x496S2e734500x468 S38800464x496', {'height': 500, 'width': 150})
{'options': {...}, 'widths': [150], 'columns': [[{'x': 56, 'y': 20, ...}, ...]]}
- sutton_signwriting_core.fsw.fsw_tokenize(fsw, sequence=True, signbox=True, sep='[SEP]')
Tokenizes an FSW string into an array of tokens.
- Parameters:
fsw (str) – FSW string to tokenize
sequence (bool) – Whether to include sequence tokens
signbox (bool) – Whether to include signbox tokens
sep (str | None) – Separator token
- Returns:
Array of tokens
- Return type:
List[str]
Example
>>> fsw_tokenize("AS10e00M507x515S10e00492x485", sequence=False, sep=None)
['M', 'p507', 'p515', 'S10e', 'c0', 'r0', 'p492', 'p485']
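The token shapes in the example suggest the following scheme: the box marker is one token, each coordinate becomes two p tokens, and each symbol key is split into its base plus c and r tokens for the fifth and sixth characters (fill and rotation in the FSW draft). The sketch below reproduces that for a single sign with the temporal prefix dropped. It is a reading of the example output, not the library's code, and tokenize_sign_sketch is a hypothetical name.

```python
import re

TOKEN_RE = re.compile(
    r"([BLMR])"                               # signbox marker
    r"|(S[123][0-9a-f]{2})([0-5])([0-9a-f])"  # symbol base, fill, rotation
    r"|([0-9]{3})x([0-9]{3})"                 # coordinate pair
)


def tokenize_sign_sketch(fsw_sign: str, sep=None) -> list:
    """Hypothetical helper: tokenize the signbox part of one FSW sign."""
    start = re.search(r"[BLMR]", fsw_sign)
    if start is None:
        return []
    # Skip the temporal prefix and drop any trailing style string.
    body = fsw_sign[start.start():].split("-", 1)[0]
    tokens = []
    for match in TOKEN_RE.finditer(body):
        box, base, fill, rotation, x, y = match.groups()
        if box:
            tokens.append(box)
        elif base:
            tokens.extend([base, "c" + fill, "r" + rotation])
        else:
            tokens.extend(["p" + x, "p" + y])
    if sep is not None:
        tokens.append(sep)
    return tokens
```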
- sutton_signwriting_core.fsw.fsw_detokenize(tokens, special_tokens=[{'index': 0, 'name': 'UNK', 'value': '[UNK]'}, {'index': 1, 'name': 'PAD', 'value': '[PAD]'}, {'index': 2, 'name': 'CLS', 'value': '[CLS]'}, {'index': 3, 'name': 'SEP', 'value': '[SEP]'}])
Converts an array of tokens back into an FSW string.
- Parameters:
tokens (List[str]) – Array of tokens to convert
special_tokens (List[SpecialToken]) – Array of special token objects to filter out
- Returns:
FSW string
- Return type:
str
Example
>>> fsw_detokenize(['M', 'p507', 'p515', 'S10e', 'c0', 'r0', 'p492', 'p485'])
"M507x515S10e00492x485"
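Detokenizing reverses the token scheme: special tokens are dropped, c and r tokens contribute their single character, and p tokens are rejoined in pairs with an x between them. The following is a minimal sketch for the tokens of a single sign; detokenize_sketch is a hypothetical name, and how the library joins multiple signs is not covered here.

```python
def detokenize_sketch(tokens, special=("[UNK]", "[PAD]", "[CLS]", "[SEP]")) -> str:
    """Hypothetical helper: rebuild one FSW sign string from its tokens."""
    parts = []
    second_of_pair = False  # True when the next p token is the y value
    for token in tokens:
        if token in special:
            continue
        if token.startswith("p"):
            parts.append(("x" if second_of_pair else "") + token[1:])
            second_of_pair = not second_of_pair
        elif token.startswith(("c", "r")):
            parts.append(token[1:])  # fill or rotation character
        else:
            parts.append(token)      # box marker or symbol base
    return "".join(parts)
```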
- sutton_signwriting_core.fsw.fsw_chunk_tokens(tokens, chunk_size, cls='[CLS]', sep='[SEP]', pad='[PAD]')
Splits tokens into chunks of the specified size while preserving sign boundaries.
- Parameters:
tokens (List[str]) – Array of tokens to chunk
chunk_size (int) – Maximum size of each chunk
cls (str) – CLS token
sep (str) – SEP token
pad (str) – PAD token
- Returns:
Array of token chunks
- Return type:
List[List[str]]
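No example is given for chunking, so the sketch below shows one plausible reading of "preserving sign boundaries" rather than the library's actual behavior. It assumes that each sign ends with the SEP token, that every chunk starts with CLS, that signs are packed greedily without being split, and that chunks are padded with PAD up to chunk_size. All of these are assumptions, as is the name chunk_tokens_sketch.

```python
def chunk_tokens_sketch(tokens, chunk_size, cls="[CLS]", sep="[SEP]", pad="[PAD]"):
    """Hypothetical helper: pack whole signs into padded chunks."""
    # Group the flat token list into signs, each ending with its SEP token.
    signs, current = [], []
    for token in tokens:
        current.append(token)
        if token == sep:
            signs.append(current)
            current = []
    if current:
        signs.append(current)

    # Greedily pack whole signs; a sign is never split across chunks.
    chunks, chunk = [], [cls]
    for sign in signs:
        if len(chunk) > 1 and len(chunk) + len(sign) > chunk_size:
            chunks.append(chunk)
            chunk = [cls]
        chunk.extend(sign)
    if len(chunk) > 1:
        chunks.append(chunk)

    # Pad short chunks; an oversized single sign is left longer than chunk_size.
    return [c + [pad] * (chunk_size - len(c)) for c in chunks]
```

With two four-token signs and chunk_size=6, this sketch yields two chunks of six tokens each, since both signs plus the CLS token would not fit in one.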
- class sutton_signwriting_core.fsw.FSWTokenizer(special_tokens=[{'index': 0, 'name': 'UNK', 'value': '[UNK]'}, {'index': 1, 'name': 'PAD', 'value': '[PAD]'}, {'index': 2, 'name': 'CLS', 'value': '[CLS]'}, {'index': 3, 'name': 'SEP', 'value': '[SEP]'}], starting_index=None)
Bases: object
Creates a tokenizer object with encoding and decoding capabilities.
- Parameters:
special_tokens (List[SpecialToken]) – Special tokens list
starting_index (int | None) – Starting index for regular tokens
Example
>>> t = FSWTokenizer()
>>> t.encode('M507x515S10e00492x485')
[7, 941, 949, 24, 678, 662, 926, 919, 3]
- vocab()
- Return type:
List[str]
- encode_tokens(tokens)
- Parameters:
tokens (List[str])
- Return type:
List[int]
- decode_tokens(indices)
- Parameters:
indices (List[int])
- Return type:
List[str]
- encode(text, sequence=True, signbox=True)
- Parameters:
text (str)
sequence (bool)
signbox (bool)
- Return type:
List[int]
- decode(tokens)
- Parameters:
tokens (List[int] | List[List[int]])
- Return type:
str
- chunk(tokens, chunk_size)
- Parameters:
tokens (List[str])
chunk_size (int)
- Return type:
List[List[str]]
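The encode_tokens and decode_tokens methods map between token strings and integer indices. The toy class below sketches that kind of lookup with the special tokens placed first. It is not FSWTokenizer: ToyVocab is a hypothetical name, its vocabulary is supplied by the caller, its indices will not match the library's, and mapping unknown tokens to [UNK] is an assumption.

```python
class ToyVocab:
    """Hypothetical token/index lookup, for illustration only."""

    def __init__(self, regular_tokens,
                 special_tokens=("[UNK]", "[PAD]", "[CLS]", "[SEP]")):
        # Special tokens take the first indices; regular tokens follow.
        self.index_to_token = list(special_tokens) + list(regular_tokens)
        self.token_to_index = {
            token: index for index, token in enumerate(self.index_to_token)
        }
        self.unk_index = self.token_to_index[special_tokens[0]]

    def encode_tokens(self, tokens):
        # ASSUMPTION: tokens outside the vocabulary map to the UNK index.
        return [self.token_to_index.get(token, self.unk_index) for token in tokens]

    def decode_tokens(self, indices):
        return [self.index_to_token[index] for index in indices]


vocab = ToyVocab(["M", "p507", "p515"])
encoded = vocab.encode_tokens(["M", "p515", "unseen"])
decoded = vocab.decode_tokens(encoded)
```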