Arabic is not as simple as Robert Firth supposes.
(1) The blanks between words and the blanks between letters inside a word are
not clearly distinguished. Some letters need a small blank after them.
(2) Vowels can be written or omitted.
(3) The points above and below the consonants are significant. If they are
omitted, the text is still readable but poorly recognisable.
For example, B, T, N and I share the same letter shape and differ only in the
points above and below.
(4) The end of the previous word changes the beginning of the next word.
For example,
BISMILLAH-IRRAHMAN-URRAHIM is actually a composition of the following parts:
Bi-SMI-ALLAHI-AL-RAHMANU-AL-RAHIMI
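
Point (3) can be made concrete with a toy model. The sketch below works on a
Latin transliteration, not on real Arabic script; only the B, T, N, I group is
taken from the text above, while the function names and the "?" placeholder are
invented for illustration.

```python
# Toy model: letters that share one shape once the points are dropped.
# Only this one group is given in the text; other groups are not modelled.
SAME_SHAPE = {"B", "T", "N", "I"}

def strip_points(word, placeholder="?"):
    """Replace every letter of the shared-shape group by one placeholder."""
    return "".join(placeholder if ch in SAME_SHAPE else ch
                   for ch in word.upper())

def count_readings(word):
    """Number of pointed spellings that collapse to the same unpointed form."""
    ambiguous = sum(ch in SAME_SHAPE for ch in word.upper())
    return len(SAME_SHAPE) ** ambiguous

print(strip_points("NABI"))    # ?A??
print(count_readings("NABI"))  # 64
```

Even in this reduced model a four-letter word with three affected letters has
64 candidate readings, which is why dropping the points hurts recognition.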
So there are a lot of ways to encipher an Arabic text:
1. Ignoring the vowels / not ignoring the vowels; it is common to omit the
vowels
2. Using the acoustic consonant change at word junctions / not using this
consonant change
3. Ignoring the differences in points between consonants / not ignoring such
differences
4. Ignoring / not ignoring the formal suffix "-UN" on all subjects, which is
not pronounced
5. Using different rules to set blanks between words and particles
6. Enciphering left-to-right or right-to-left
These 6 points are enough to make the cipher structure difficult to recognise
for Arabic.
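
To get a feeling for how quickly these choices multiply, here is a minimal
sketch that treats each of the six points as a yes/no switch. That is a
simplification (point 5 in particular allows more than two rule sets), and the
option names and the vowel set are labels assumed for this example only. Only
the two switches that are trivial on a Latin transliteration are implemented.

```python
from itertools import product

# One label per point in the list above; the names are invented for this sketch.
OPTIONS = [
    "omit_vowels",
    "apply_junction_changes",
    "ignore_points",
    "drop_un_suffix",
    "alternative_blank_rules",
    "right_to_left",
]

# Assumption: A, I, U stand for the vowels of the transliteration.
VOWELS = set("AIU")

def all_schemes():
    """Return every combination of the six switches as a dict."""
    return [dict(zip(OPTIONS, bits))
            for bits in product([False, True], repeat=len(OPTIONS))]

def preprocess(text, scheme):
    """Apply vowel omission and writing direction; other switches are ignored."""
    if scheme["omit_vowels"]:
        text = "".join(ch for ch in text if ch.upper() not in VOWELS)
    if scheme["right_to_left"]:
        text = text[::-1]
    return text

schemes = all_schemes()
print(len(schemes))  # 64

plain = schemes[0]   # all switches off
print(preprocess("BISMILLAH", {**plain, "omit_vowels": True}))  # BSMLLH
```

With only binary switches there are already 64 different preprocessing schemes
before any actual cipher is applied.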
What are the arguments for Arabic?
(1) In some marginal cases of enciphering, the entropy becomes low.
(2) The letters have different initial and final forms, depending on whether
a blank comes before or after them.
(3) Practically each "legal" combination of letters (see Stolfi's grammar)
becomes a legal word.
(4) The blanks do not have a very strong meaning:
the statistical distribution of the 2-letter combinations inside words is very
similar to the distribution of the 2-letter combinations formed by the letters
before and after a blank.
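
The comparison in point (4) could be checked along the following lines. This is
only a sketch: it assumes the text is already transcribed with single
characters per letter and blanks between words, and the similarity measure
(histogram intersection) is my own choice, not something the argument above
prescribes.

```python
from collections import Counter

def bigram_counts(text):
    """Count letter pairs inside words and letter pairs that straddle a blank."""
    words = text.split()
    inside = Counter()
    across = Counter()
    for w in words:
        inside.update(zip(w, w[1:]))       # consecutive letters within a word
    for a, b in zip(words, words[1:]):
        across[(a[-1], b[0])] += 1         # last letter, blank, first letter
    return inside, across

def overlap(p, q):
    """Histogram intersection of two count tables after normalisation.

    1.0 means identical distributions, 0.0 means no pair in common.
    """
    total_p, total_q = sum(p.values()), sum(q.values())
    if total_p == 0 or total_q == 0:
        return 0.0
    return sum(min(p[k] / total_p, q[k] / total_q)
               for k in p.keys() & q.keys())

inside, across = bigram_counts("aa aa")
print(overlap(inside, across))  # 1.0
```

A value close to 1.0 on a real transcription would support the claim that the
blanks carry little information; a value close to 0.0 would speak against it.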