WIKINDX API 6.7.2

UTF8.php

WIKINDX : Bibliographic Management system.

Functions

html_numeric_entity_decode() : string: Convert numeric HTML entities to their corresponding characters
mb_ucfirst() : string: A Unicode-aware replacement for ucfirst()
mb_str_word_count() : int|array<string|int, string>: Count UTF-8 words in a string
mb_explode() : string: Simulate explode() for multibyte strings (as documented for PHP 7.0)
mb_str_pad() : string: Simulate str_pad() for multibyte strings
mb_strcasecmp() : string: Simulate strcasecmp() for multibyte strings
mb_strrev() : string: Simulate strrev() for multibyte strings
mb_substr_replace() : string: Simulate substr_replace() for multibyte strings
mb_trim() : string: Multibyte-friendly trim(); code by Ben XO at https://www.php.net/manual/en/ref.mbstring.php

html_numeric_entity_decode()

Convert numeric HTML entities to their corresponding characters


    
    html_numeric_entity_decode(string $str) : string

Acts like the built-in html_entity_decode() but also converts control characters.

Parameters

$str : string

Return values

string
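
As a rough illustration of what decoding numeric entities involves, the sketch below converts decimal and hexadecimal references with preg_replace_callback() and mb_chr(), including control characters such as &#9;. The name sketch_numeric_entity_decode() is hypothetical; this is not the WIKINDX implementation, only one way the technique could look.

```php
<?php
// Hypothetical helper, not the WIKINDX function:
// decodes &#NNN; (decimal) and &#xHHH; (hexadecimal) references.
function sketch_numeric_entity_decode(string $str): string
{
    $decoded = preg_replace_callback(
        '/&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));/',
        static function (array $m): string {
            // Group 1 holds hex digits, group 2 holds decimal digits.
            $code = ($m[1] !== '') ? (int) hexdec($m[1]) : (int) $m[2];
            $char = mb_chr($code, 'UTF-8');
            // Leave the reference untouched if it is not a valid code point.
            return ($char === false) ? $m[0] : $char;
        },
        $str
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $decoded ?? $str;
}

var_dump(sketch_numeric_entity_decode('caf&#233;&#9;&#x263A;'));
```

Named entities such as &amp; are deliberately left alone in this sketch.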

mb_ucfirst()

A Unicode-aware replacement for ucfirst()


    
    mb_ucfirst(string $str) : string

Parameters

$str : string

Return values

string
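
A minimal sketch of the idea, assuming UTF-8 input: uppercase the first character with the mbstring functions and append the rest unchanged. The name sketch_mb_ucfirst() and its $encoding parameter are hypothetical and are not taken from WIKINDX.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
function sketch_mb_ucfirst(string $str, string $encoding = 'UTF-8'): string
{
    // Split on character boundaries rather than byte boundaries.
    $first = mb_substr($str, 0, 1, $encoding);
    $rest = mb_substr($str, 1, null, $encoding);

    return mb_strtoupper($first, $encoding) . $rest;
}

var_dump(sketch_mb_ucfirst('élan'));
```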

mb_str_word_count()

Count UTF-8 words in a string


    
    mb_str_word_count(string $str[, string $format = 0 ][, string $charlist = "" ]) : int|array<string|int, string>

This simple UTF-8 word count function (it only counts) is a bit faster than the one using preg_match_all() but about 10x slower than the built-in str_word_count().

If you need the hyphen or other code points to count as word characters, add them to the bracketed character class, for example [^\p{L}\p{N}'-].

Parameters

$str : string
$format : string = 0
$charlist : string = ""

Return values

int|array<string|int, string>
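
The character-class approach described above can be sketched as follows. This is an assumption about the general technique, not the WIKINDX code: sketch_mb_word_count() and its $extraWordChars parameter are hypothetical, and the sketch only counts words.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
// Splits on runs of characters that are not letters, digits or apostrophes.
function sketch_mb_word_count(string $str, string $extraWordChars = ''): int
{
    // Extra word characters (for example '-') are escaped and added to the class.
    $class = "[^\\p{L}\\p{N}'" . preg_quote($extraWordChars, '/') . ']+';
    $words = preg_split('/' . $class . '/u', $str, -1, PREG_SPLIT_NO_EMPTY);

    return ($words === false) ? 0 : count($words);
}

var_dump(sketch_mb_word_count('Žluťoučký kůň úpěl'));
var_dump(sketch_mb_word_count('well-known fact'));
var_dump(sketch_mb_word_count('well-known fact', '-'));
```

Adding the hyphen to the class makes "well-known" count as one word instead of two.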

mb_explode()

Simulate explode() for multibyte strings (as documented for PHP 7.0)


    
    mb_explode(string $delimiter, string $string[, int $limit = PHP_INT_MAX ]) : string

Parameters

$delimiter : string
$string : string
$limit : int = PHP_INT_MAX: Default is PHP_INT_MAX.

Return values

string
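
One possible way to split on a multibyte delimiter is a Unicode-mode preg_split(), sketched below. The name sketch_mb_explode() is hypothetical, the sketch returns an array as explode() does, and it handles only a positive $limit; how WIKINDX treats zero or negative limits is not covered here.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
function sketch_mb_explode(string $delimiter, string $string, int $limit = PHP_INT_MAX): array
{
    if ($delimiter === '' || $limit < 1) {
        throw new InvalidArgumentException('Sketch supports a non-empty delimiter and a positive limit only.');
    }

    // preg_quote() makes the delimiter literal; the u modifier treats input as UTF-8.
    $parts = preg_split('/' . preg_quote($delimiter, '/') . '/u', $string, $limit);

    return ($parts === false) ? [$string] : $parts;
}

print_r(sketch_mb_explode('、', '東京、大阪、京都'));
print_r(sketch_mb_explode('、', '東京、大阪、京都', 2));
```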

mb_str_pad()

Simulate str_pad() for multibyte strings


    
    mb_str_pad(string $str, int $pad_len[, string $pad_str = ' ' ][, string $dir = STR_PAD_RIGHT ][, string $encoding = NULL ]) : string

Parameters

$str : string
$pad_len : int
$pad_str : string = ' ': Default is ' '.
$dir : string = STR_PAD_RIGHT: Default is STR_PAD_RIGHT.
$encoding : string = NULL: Default is NULL.

Return values

string
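
The difference from str_pad() is that the target length is measured in characters, not bytes. A minimal sketch under that assumption follows; sketch_mb_str_pad() is a hypothetical name, it takes $dir as the integer STR_PAD_* constants, and it defaults to UTF-8 rather than NULL.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
function sketch_mb_str_pad(
    string $str,
    int $padLen,
    string $padStr = ' ',
    int $dir = STR_PAD_RIGHT,
    string $encoding = 'UTF-8'
): string {
    // Number of characters (not bytes) still needed.
    $missing = $padLen - mb_strlen($str, $encoding);
    if ($missing <= 0 || $padStr === '') {
        return $str;
    }

    // Build exactly $n characters of padding by repeating and truncating $padStr.
    $fill = static function (int $n) use ($padStr, $encoding): string {
        $reps = (int) ceil($n / mb_strlen($padStr, $encoding));
        return mb_substr(str_repeat($padStr, $reps), 0, $n, $encoding);
    };

    if ($dir === STR_PAD_LEFT) {
        return $fill($missing) . $str;
    }
    if ($dir === STR_PAD_BOTH) {
        $left = intdiv($missing, 2);
        return $fill($left) . $str . $fill($missing - $left);
    }
    return $str . $fill($missing);
}

var_dump(sketch_mb_str_pad('añb', 5, '*'));
var_dump(str_pad('añb', 5, '*')); // byte-based: ñ occupies two bytes in UTF-8
```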

mb_strcasecmp()

Simulate strcasecmp() for multibyte strings


    
    mb_strcasecmp(string $str1, string $str2[, string $encoding = NULL ]) : string

A simple multibyte-safe case-insensitive string comparison

Parameters

$str1 : string
$str2 : string
$encoding : string = NULL: Default is NULL.

Return values

string
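
A common way to get a multibyte-safe, case-insensitive comparison is to lowercase both strings with mb_strtolower() and compare the results with strcmp(). The sketch below assumes that approach and UTF-8 input; sketch_mb_strcasecmp() is a hypothetical name, and it returns an integer as strcasecmp() does.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
// Returns < 0, 0 or > 0, like strcmp().
function sketch_mb_strcasecmp(string $str1, string $str2, string $encoding = 'UTF-8'): int
{
    return strcmp(
        mb_strtolower($str1, $encoding),
        mb_strtolower($str2, $encoding)
    );
}

var_dump(sketch_mb_strcasecmp('ÉCOLE', 'école') === 0);
```

The ordering of unequal strings is by byte value after lowercasing, not by any locale collation.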

mb_strrev()

Simulate strrev() for multibyte strings


    
    mb_strrev(string $str) : string

Parameters

$str : string

Return values

string
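
strrev() reverses bytes, which breaks multibyte sequences. A minimal sketch of a character-level reversal, assuming UTF-8 and PHP 7.4 or later for mb_str_split(), is shown here; sketch_mb_strrev() is a hypothetical name, not the WIKINDX implementation.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
function sketch_mb_strrev(string $str, string $encoding = 'UTF-8'): string
{
    // Split into characters (code points), reverse the list, and join again.
    return implode('', array_reverse(mb_str_split($str, 1, $encoding)));
}

var_dump(sketch_mb_strrev('añb'));
```

The sketch reverses code points, so a base letter followed by a combining mark would end up with the mark first.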

mb_substr_replace()

Simulate substr_replace() for multibyte strings


    
    mb_substr_replace(string $string, string $replacement, int $start[, int $length = NULL ][, string $encoding = NULL ]) : string

Parameters

$string : string
$replacement : string
$start : int
$length : int = NULL: Default is NULL.
$encoding : string = NULL: Default is NULL.

Return values

string
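
The replacement can be pictured as "head + replacement + tail", with offsets counted in characters. The sketch below assumes UTF-8 and handles only non-negative $start and $length; sketch_mb_substr_replace() is a hypothetical name and does not reproduce the negative-offset behaviour of substr_replace().

```php
<?php
// Hypothetical helper, not the WIKINDX function.
function sketch_mb_substr_replace(
    string $string,
    string $replacement,
    int $start,
    ?int $length = null,
    string $encoding = 'UTF-8'
): string {
    // Everything before the replaced span.
    $head = mb_substr($string, 0, $start, $encoding);
    // Everything after it; a null length replaces through the end of the string.
    $tail = ($length === null)
        ? ''
        : mb_substr($string, $start + $length, null, $encoding);

    return $head . $replacement . $tail;
}

var_dump(sketch_mb_substr_replace('héllo wörld', 'W', 6, 1));
```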

mb_trim()

Multibyte-friendly trim(); code by Ben XO at https://www.php.net/manual/en/ref.mbstring.php


    
    mb_trim(string $string[, string $charlist = '\\s' ][, bool $ltrim = TRUE ][, bool $rtrim = TRUE ]) : string

Trim characters from either (or both) ends of a string in a way that is multibyte-friendly.

Mostly, this behaves exactly like trim() would: for example, supplying 'abc' as the charlist will trim all 'a', 'b' and 'c' characters from the string, with the added bonus that you can put Unicode characters in the charlist.

We are using a PCRE character class to do the trimming in a Unicode-aware way, so we must escape ^, \, - and ], which have special meanings here. As you would expect, a single \ in the charlist is interpreted as "trim backslashes" (and duly escaped into a double \). Under most circumstances you can ignore this detail.

As a bonus, however, we also allow PCRE special character classes (such as '\s') because they can be extremely useful when dealing with UCS. '\pZ', for example, matches every 'separator' character defined in Unicode, including non-breaking spaces.

It doesn't make sense to have two or more of the same character in a character class, therefore we interpret a double \ in the character list to mean a single \ in the regex, allowing you to safely mix normal characters with PCRE special classes.

Be careful when using this bonus feature, as PHP also interprets backslashes as escape characters before they are even seen by the regex. Therefore, to specify '\\s' in the charlist (which will be converted to the special character class '\s' for trimming), you will usually have to put 4 backslashes in the PHP code, as you can see from the default value of $charlist.
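
The backslash counting can be checked directly in PHP. This short example only demonstrates how single-quoted string literals collapse backslashes; the variable names are illustrative.

```php
<?php
// In a single-quoted PHP literal, each \\ pair collapses to one backslash.
$two = '\\s';     // source has 2 backslashes; the string holds \ and s
$four = '\\\\s';  // source has 4 backslashes; the string holds \, \ and s

var_dump(strlen($two));   // length of the 2-backslash literal
var_dump(strlen($four));  // length of the 4-backslash literal
var_dump($two === '\s');  // a lone \ before s is kept as-is in single quotes
```

So a charlist that should contain two backslashes followed by s has to be written with four backslashes in the source.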

Parameters

$string : string: The string to trim
$charlist : string = '\\s': list of characters to remove from the ends
$ltrim : bool = TRUE: trim the left? (Default is TRUE)
$rtrim : bool = TRUE: trim the right? (Default is TRUE)

Return values

string
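
To illustrate trimming with a PCRE character class, here is a simplified sketch. It is not Ben XO's code and differs in design: instead of one charlist with backslash rules, it takes literal characters (escaped with preg_quote()) and raw PCRE classes as two separate, hypothetical parameters. The name sketch_mb_trim() is also hypothetical.

```php
<?php
// Hypothetical helper, not the WIKINDX function.
// $literalChars: characters to trim, taken literally.
// $pcreClasses:  raw PCRE class escapes such as '\s' or '\pZ'.
function sketch_mb_trim(
    string $string,
    string $literalChars = '',
    string $pcreClasses = '\s',
    bool $ltrim = true,
    bool $rtrim = true
): string {
    $inner = preg_quote($literalChars, '/') . $pcreClasses;
    if ($inner === '' || (!$ltrim && !$rtrim)) {
        return $string;
    }

    $class = '[' . $inner . ']+';
    $patterns = [];
    if ($ltrim) {
        $patterns[] = '\A' . $class; // run at the very start
    }
    if ($rtrim) {
        $patterns[] = $class . '\z'; // run at the very end
    }

    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/' . implode('|', $patterns) . '/u', '', $string);

    return $result ?? $string;
}

// Trim Unicode separators (no-break space, ideographic space).
var_dump(sketch_mb_trim("\u{00A0} héllo\u{3000}", '', '\pZ'));
// Trim literal guillemets and hyphens plus whitespace.
var_dump(sketch_mb_trim('«-héllo-»', '«»-'));
```

Keeping literals and PCRE classes apart avoids the backslash-doubling rules at the cost of a longer parameter list.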
