Unicode - Wikiwand
Universal Character Set characters - Wikiwand
Code point - Wikiwand
BMP
SMP
Astral Planes
Plain Text • Dylan Beattie • GOTO 2023 - YouTube ❗!important, 43:11, ASCII history, code page, Unicode, sorting, normalization, encoding, emoji, ligatures
Plain Text - Dylan Beattie - NDC Copenhagen 2022 - YouTube
Code page - Wikiwand
In the ASCII era, a code page defined what the upper 128 byte values (0x80 to 0xFF) represent, since 7-bit ASCII only assigns 0x00 to 0x7F. A code page is sometimes tied to a particular use case or application.
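
A minimal sketch of the code page idea, assuming Python 3 and its built-in `cp437`, `latin-1`, and `cp1252` codecs (the choice of byte 0xE9 is only for illustration): the same byte decodes to different characters depending on which code page is assumed.

```python
# One byte from the upper half (0x80-0xFF), which 7-bit ASCII does not define.
raw = bytes([0xE9])

# Decode the same byte under three different code pages.
decoded = {codec: raw.decode(codec) for codec in ("cp437", "latin-1", "cp1252")}

for codec, char in decoded.items():
    print(f"{codec:8} 0xE9 -> {char!r} (U+{ord(char):04X})")

# Plain ASCII rejects the byte outright.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print("valid ASCII:", ascii_ok)
```

Code page 437 (the original IBM PC set) gives a Greek capital theta here, while Latin-1 and Windows-1252 both give é.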
Characters, Symbols and the Unicode Miracle - Computerphile - YouTube
EXTRA BITS - UTF-8 'nearly' works - Computerphile - YouTube
Unicode, in friendly terms: ASCII, UTF-8, code points, character encodings, and more - YouTube
These Keys Shouldn't Exist | Nostalgia Nerd - YouTube ASCII and broken pipe character, lingering as non-ASCII (Code page 437) for IBM PCs
Plain Text - Dylan Beattie - NDC Oslo 2021 - YouTube from encoding to Unicode, composition form, normalization form, UTF8, emoji
锟斤拷 �⊠ 是怎样炼成的——中文显示「⼊」门指南【柴知道】 - YouTube (How 锟斤拷 and � come about: a beginner's guide to displaying Chinese text)
Alt + code point: inputs a Unicode character
Special Characters Ø, ©, ±, °… [PC] | Tim Bird
Programming with Unicode — Programming with Unicode
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – Joel on Software !important
What every JavaScript developer should know about Unicode
Legacy Character Models and an Introduction to Unicode - Slide list
From Python PEP-261:
**Character**
Used by itself, means the addressable units of a Python Unicode string.
**Code point**
A code point is an integer between 0 and TOPCHAR. If you imagine Unicode as a mapping from integers to characters, each integer is a code point. But the integers between 0 and TOPCHAR that do not map to characters are also code points. Some will someday be used for characters. Some are guaranteed never to be used for characters.
**Codec**
A set of functions for translating between physical encodings (e.g. on disk or coming in from a network) into logical Python objects.
**Encoding**
Mechanism for representing abstract characters in terms of physical bits and bytes. Encodings allow us to store Unicode characters on disk and transmit them over networks in a manner that is compatible with other Unicode software.
**Surrogate pair**
Two physical characters that represent a single logical character. Part of a convention for representing 32-bit code points in terms of two 16-bit code points.
**Unicode string**
A Python type representing a sequence of code points with "string semantics" (e.g. case conversions, regular expression compatibility, etc.) Constructed with the unicode() function.
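
A small sketch of the terms above, assuming Python 3, where `str` plays the role of the PEP's Unicode string (the `unicode()` function belongs to Python 2). The sample string and the helper name `to_surrogates` are illustrative, not taken from the PEP.

```python
s = "a\u00e9\U0001F600"  # 'a', 'é', and U+1F600, which lies outside the BMP

# A Python 3 str is a sequence of code points.
code_points = [f"U+{ord(ch):04X}" for ch in s]

# Encodings turn code points into bytes.
utf8 = s.encode("utf-8")
utf16 = s.encode("utf-16-be")  # big-endian, no byte order mark

# Split the UTF-16 bytes into 16-bit code units.
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]


def to_surrogates(cp: int) -> tuple[int, int]:
    """Hypothetical helper: split a code point above U+FFFF into a surrogate pair."""
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


print(code_points)                 # ['U+0061', 'U+00E9', 'U+1F600']
print(len(s), len(utf8))           # 3 code points, 7 UTF-8 bytes
print([hex(u) for u in units])     # ['0x61', '0xe9', '0xd83d', '0xde00']
print([hex(u) for u in to_surrogates(0x1F600)])  # ['0xd83d', '0xde00']
```

The last two code units are the surrogate pair: one logical character stored as two 16-bit units.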
&what: Discover Unicode & HTML Character Entities
Math Unicode Entities
Unify – Unicode support on browsers and devices
表意文字小組 (Ideographic Research Group, IRG) - Wikiwand
中日韓統一表意文字 (CJK Unified Ideographs) - Wikiwand
UAX #38: Unicode Han Database (Unihan)
Combining Marks/Normalization
Combining character - Wikiwand
Zalgo Text Generator ― LingoJam 😄funny
FAQ - Normalization
Unicode equivalence - Wikiwand
String.prototype.normalize() - JavaScript | MDN
UAX #15: Unicode Normalization Forms
Example: é is U+00E9 when composed (NFC) and e + ◌́ (U+0065 U+0301) when decomposed (NFD).
NFC: Normalization Form Canonical Composition, usually the smallest number of code points
NFD: Normalization Form Canonical Decomposition, usually the largest number of code points
NFKC: Normalization Form Compatibility Composition.
NFKD: Normalization Form Compatibility Decomposition.
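
The four forms can be tried out with Python's standard `unicodedata` module. This is only a sketch, and the ligature example (U+FB01) is an illustrative choice for the compatibility forms.

```python
import unicodedata

composed = "\u00e9"     # é as a single precomposed code point
decomposed = "e\u0301"  # e followed by COMBINING ACUTE ACCENT

# The two strings render the same but compare unequal until normalized.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# Compatibility forms also fold compatibility characters such as ligatures.
ligature = "\ufb01"  # LATIN SMALL LIGATURE FI
nfc_lig = unicodedata.normalize("NFC", ligature)
nfkc_lig = unicodedata.normalize("NFKC", ligature)

print(composed == decomposed)   # False
print(nfc == composed, len(nfc))    # True 1
print(nfd == decomposed, len(nfd))  # True 2
print(nfc_lig == ligature)      # True: canonical forms keep the ligature
print(nfkc_lig)                 # fi
```

Normalizing both sides to the same form before comparing or hashing avoids treating visually identical strings as different.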
Unicode Normalization forms - C# - OneCompiler
dotnet_summit_by.cs
Unicode 相容字元 (Unicode compatibility characters) - Wikiwand
Unicode compatibility characters - Wikiwand
Allows multiple glyphs for one code point
異體字選擇器 (Variation selector) - Wikiwand
Variant form (Unicode) - Wikiwand
Encoding
UTF-8 - Wikiwand
UTF-16 - Wikiwand
Surrogates
RFC 3629 - UTF-8, a transformation format of ISO 10646
Byte order mark - Wikiwand
FAQ - UTF-8, UTF-16, UTF-32 & BOM
UTR#17: Unicode Character Encoding Model
research!rsc: UTF-8: Bits, Bytes, and Benefits
Hello World or Καλημέρα κόσμε or こんにちは 世界
Punycode Domain Name
RFC 3492 - Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)
Punycode converter (IDN converter), Punycode to Unicode 🔧
Phishing with Unicode Domains - Xudong Zheng
Internationalized Domain Names (IDN) in Google Chrome
Emoji
Emoji - Wikiwand
How emoji conquered the world | The Verge
The Oral History Of The Poop Emoji (Or, How Google Brought Poop To America) | Fast Company | Business + Innovation
Emoji and the Levitating Businessman - Computerphile - YouTube
Black Woman Astronaut = Woman (U+1F469) + Dark Skin Tone (U+1F3FF) + Zero Width Joiner (U+200D) + Rocket (U+1F680)
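
A short sketch of that sequence in Python 3, using the standard `unicodedata` module to name each code point. Whether it renders as one glyph depends on the font and platform.

```python
import unicodedata

# Woman + dark skin tone modifier + zero width joiner + rocket
astronaut = "\U0001F469\U0001F3FF\u200D\U0001F680"

names = [unicodedata.name(ch) for ch in astronaut]
utf8_len = len(astronaut.encode("utf-8"))
utf16_units = len(astronaut.encode("utf-16-le")) // 2

for ch, name in zip(astronaut, names):
    print(f"U+{ord(ch):04X} {name}")

print(len(astronaut))  # 4 code points
print(utf8_len)        # 15 bytes in UTF-8
print(utf16_units)     # 7 code units in UTF-16
```

One visible emoji is therefore four code points, which is why string length and "number of characters on screen" often disagree.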
iEmoji.com
Emoji searcher
📙 Emojipedia — 😃 Home of Emoji Meanings 💁👌🎍😍
😋 Get Emoji — List of all Emojis to ✂ Copy and 📋 Paste 👌
emojidex - custom emoji service and apps
Full Emoji List, v14.0
🎁 Emoji cheat sheet for GitHub, Basecamp, Slack & more
Intro to Emoji URLs - DEV Community
Library
muan/mojibar: Emoji searcher but as a menubar app.
Twemoji
twitter/twemoji: Emoji for everyone. https://twemoji.twitter.com/
Open sourcing Twitter emoji for everyone
JoyPixels® - Freemium emoji icons. Emoji font licensing.
NeelShah18/emot: Open source Emoticons and Emoji detection library: emot
omnidan/node-emoji: simple emoji support for node.js projects
denosaurs/emoji: 🦄 Emojis for dinosaurs
Font
android - CSS reference to phone's Emoji font? - Stack Overflow
jslegers/emoji-icon-font: An experimental icon font
Twemoji Awesome | Like Font Awesome, but for Twitter Emoji.
EmojiSymbols Font
MorbZ/OpenSansEmoji: OpenSans based font which includes the full iOS Emoji set
Google Noto Fonts - Noto Emoji
Google Noto Fonts - Noto Color Emoji
Emoji on the Web – Making Faces (and Other Emoji) – Medium
Character Table
Unicode character table
Unicode/UTF-8-character table
Unicodinator
Find all Unicode characters from Hieroglyphs to Dingbats – Codepoints
Unicode codepoint lookup/search tool
&what: Discover Unicode & HTML Character Entities
Unicode Characters ☯ ⚡ ∑ ♥ 😄
Graphemica - For people who ♥ letters, numbers, punctuation, &c
Code Charts (Unicode official one, PDFs)
List of Unicode characters - Wikiwand
Unicode Table
Typography Cheatsheet → A Comprehensive Guide to Smart Quotes, Dashes & Other Typographic Characters → Typewolf
Keycodes - Javascript Keyboard Codes, Character Codes, Unicode, HTML Entities
HTML Symbols – HTML Icon and Entity Code List
Shapecatcher: Draw the Unicode character you want!
Guobiao
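國家標準代碼 (Guobiao, national standard code) - Wikiwand
国标码查询;汉字国家标准编码:GB2312、GBK、GB18030 (Guobiao code lookup; national standard encodings for Chinese characters: GB2312, GBK, GB18030)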
GB2312 (in its usual EUC-CN form): 2 bytes per Chinese character, each byte with its leading bit set to 1
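
A quick check of the two-byte layout, assuming Python 3's built-in `gb2312` codec; the sample text is an arbitrary choice.

```python
text = "中文"
encoded = text.encode("gb2312")

# Each Chinese character takes two bytes, and every byte has its top bit set.
bits = [f"{b:08b}" for b in encoded]
all_high = all(b & 0x80 for b in encoded)

# ASCII characters stay single bytes with the top bit clear,
# which is how the two ranges are told apart.
ascii_byte = "A".encode("gb2312")

print(encoded.hex(" "))  # d6 d0 ce c4
print(bits)
print(all_high)          # True
print(ascii_byte)        # b'A'
```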
Sorting
UTS #10: Unicode Collation Algorithm sorting
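Why is the dictionary order of the Chinese numerals "一二三四五六七八九十" different from their numeric order, coming out as "一七三九二五八六十四"? - Zhihu
Character | Unicode code point |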
---|---|
一 | 0x4e00 |
二 | 0x4e8c |
三 | 0x4e09 |
四 | 0x56db |
五 | 0x4e94 |
六 | 0x516d |
七 | 0x4e03 |
八 | 0x516b |
九 | 0x4e5d |
十 | 0x5341 |
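
The table explains the odd order: a plain sort compares code points, not numeric values. A minimal sketch in Python 3, where the explicit `numeric_order` string is an illustrative stand-in for real collation (which UTS #10 covers):

```python
numerals = "一二三四五六七八九十"

# Default string sorting compares code points.
by_code_point = "".join(sorted(numerals))

# Sorting by position in an explicit list restores numeric order.
numeric_order = "一二三四五六七八九十"
by_value = "".join(sorted(by_code_point, key=numeric_order.index))

for ch in by_code_point:
    print(ch, hex(ord(ch)))

print(by_code_point)  # 一七三九二五八六十四
print(by_value)       # 一二三四五六七八九十
```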