This thesis presents a new algorithm for recognizing the numbers from "one" to "ten" in the Arabic language. The backbone of the thesis is the analysis of a series of images of the speaker: detecting the first frame and the end frame, computing the motion vector, and finally extracting moment features from the motion vector. A database of ten speakers is used to test the results; every speaker pronounces the ten words fifteen times. Three recognition methods are used: K-means, fuzzy K-means, and K-NN. The database is divided into two parts: the first part is used to determine which recognition method is best for our purpose, while the second is for testing ... This and much more can be found in the book Visual Speech Recognition (Ikrami Eldirawy and Wesam Ashour).
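The abstract gives no implementation details, so the following is only a minimal sketch of how a pipeline of this shape might look, not the authors' method. It assumes that motion is measured by summing absolute pixel differences between consecutive grayscale frames, and that the "moment features" are the mean and central statistical moments of that motion signal (the thesis may well use a different definition, such as image moments). Detection of the first and end frames is omitted. All function names (`motion_signal`, `moment_features`, `knn_predict`) are hypothetical.

```python
import math
from collections import Counter


def motion_signal(frames):
    """Return one motion value per pair of consecutive frames.

    Assumption: each frame is a 2-D list of grayscale intensities, and
    motion is the sum of absolute pixel differences between frames.
    """
    signal = []
    for prev, curr in zip(frames, frames[1:]):
        diff = sum(
            abs(a - b)
            for row_prev, row_curr in zip(prev, curr)
            for a, b in zip(row_prev, row_curr)
        )
        signal.append(float(diff))
    return signal


def moment_features(signal, orders=(2, 3, 4)):
    """Return the mean followed by central moments of the given orders.

    Assumption: statistical moments of the 1-D motion signal stand in
    for the moment features mentioned in the abstract.
    """
    n = len(signal)
    mean = sum(signal) / n
    central = [sum((x - mean) ** k for x in signal) / n for k in orders]
    return [mean] + central


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs; distance is
    Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In use, each recorded utterance would be reduced to one feature vector with `moment_features(motion_signal(frames))`, and `knn_predict` would compare it against labelled vectors from the training part of the database. K-means or fuzzy K-means could be swapped in for the final step in the same way.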