Text Auto-completion on Android Using the Levenshtein Distance

In a previous tutorial, the Levenshtein distance is implemented to measure the distance between 2 words based on a vector.

This tutorial builds on that implementation to create an Android app that automatically gives suggestions to the user as he/she is typing. The distances between the input text and an English dictionary of 20,000 words are calculated. The words with the lowest distances are suggested to the user to automatically complete the text.

The tutorial covers the following:

  • Building the UI of the Android App
  • Levenshtein Distance Implementation
  • Comparing the Dictionary to the User Input and Making Suggestions

Building the UI of the Android App

This section creates the user interface of the Android app, which contains an EditText view that allows the user to enter text, in addition to 3 Button views for printing suggestions based on the calculated distances. The XML layout of the main activity is provided in the code below.

The layout of the activity is a vertical LinearLayout that holds 2 direct children. The first one is an EditText view that allows the user to enter text. To be able to access this view within the Java code, it’s given an ID of inputWord.

The second direct child of the root layout is another LinearLayout, which horizontally and equally splits the activity width across the 3 Button views holding the suggestions for the user. The IDs of these 3 buttons are firstSuggestion, secondSuggestion, and thirdSuggestion. The callback method for these 3 buttons is named selectWord(). When a button is clicked, then its text will be transferred to the EditText view.

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    xmlns:tools="http://schemas.android.com/tools"
    android:orientation="vertical"
    tools:context=".MainActivity">

    <EditText
        android:id="@+id/inputWord"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:hint="Enter a single word" />

    <LinearLayout
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:orientation="horizontal">
        <Button
            android:onClick="selectWord"
            android:id="@+id/firstSuggestion"
            android:layout_width="wrap_content"
            android:layout_height="match_parent"
            android:layout_weight="1"
            android:textAllCaps="false"
            android:text="First Word" />
        <Button
            android:onClick="selectWord"
            android:id="@+id/secondSuggestion"
            android:layout_width="wrap_content"
            android:layout_height="match_parent"
            android:layout_weight="1"
            android:textAllCaps="false"
            android:text="Second Word" />
        <Button
            android:onClick="selectWord"
            android:id="@+id/thirdSuggestion"
            android:layout_width="wrap_content"
            android:layout_height="match_parent"
            android:layout_weight="1"
            android:textAllCaps="false"
            android:text="Third Word" />
    </LinearLayout>

</LinearLayout>

The next figure shows what this layout looks like:

At this time, the user can enter text in the EditText view. Next, we need to accept that text from Java in order to compare it with the dictionary. The main structure of the main activity that contains only the onCreate() method is listed below, given that the XML layout is named activity_main.

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
    }
}

Inside the MainActivity class header, 4 variables are created to hold the 4 views (1 EditText and 3 Button).

Within the onCreate() method, the EditText and the 3 Button views are returned into such 4 variables:

Because autocompletion suggestions should be given while the user is typing, we have to get the text from the EditText view for each change. For this reason, an TextChangedListener is attached to the EditText view to fire an event for each change in the text. This is shown in the code below. There are 3 callback methods inside the listener—beforeTextChanged(), onTextChanged(), and afterTextChanged().

enteredWord.addTextChangedListener(new TextWatcher() {
    @Override
    public void beforeTextChanged(CharSequence charSequence, int i, int i1, int i2) {

    }

    @Override
    public void onTextChanged(CharSequence charSequence, int i, int i1, int i2) {
        String word = enteredWord.getText().toString();
        if (word.length() == 0){
            return;
        }
    }

    @Override
    public void afterTextChanged(Editable editable) {

    }
});

The method we’ll use is the onTextChanged() method to return the current text within the EditText view inside the word String variable. Once the current text inside the EditText view is available, the next step is to compare it to the dictionary and suggest the closest words to the user for autocompletion.

Later, we’ll create a method named returnSuggestions(), which accepts the user input and the number of suggestions to be returned. This method returns a String array of the suggested words. Such words will then be used as the text for the 3 Button views.

Once a button is clicked, then the selectWord() callback method will be called. Its implementation is as follows. Put simply, it sets the text of the EditText view equal to the text on the button. Using the setSelection() method, the cursor is set at the end of the EditText view to have a better experience for the user to continue typing after the autocompleted word.

At this moment, here is the complete Java code of the main activity. The next section provides an overview of the Levenshtein distance implementation created in the previous tutorial.

public class MainActivity extends AppCompatActivity {
    EditText enteredWord;
    Button firstSuggestion;
    Button secondSuggestion;
    Button thirdSuggestion;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        enteredWord = findViewById(R.id.inputWord);
        firstSuggestion = findViewById(R.id.firstSuggestion);
        secondSuggestion = findViewById(R.id.secondSuggestion);
        thirdSuggestion = findViewById(R.id.thirdSuggestion);
        enteredWord.addTextChangedListener(new TextWatcher() {
            @Override
            public void beforeTextChanged(CharSequence charSequence, int i, int i1, int i2) {
            }

            @Override
            public void onTextChanged(CharSequence charSequence, int i, int i1, int i2) {
                String word = enteredWord.getText().toString();
                if (word.length() == 0){
                    return;
                }
                String [] closestWord = returnSuggestions(word.toLowerCase(), 3);
                if (closestWord != null){
                    firstSuggestion.setText(closestWord[0]);
                    secondSuggestion.setText(closestWord[1]);
                    thirdSuggestion.setText(closestWord[2]);
                }
            }

            @Override
            public void afterTextChanged(Editable editable) {
            }
        });
    }
    public void selectWord(View view){
    	Button button = (Button) view;
	    enteredWord.setText(button.getText() + " ");
    	enteredWord.setSelection(enteredWord.getText().length());
    }
}

Levenshtein Distance Implementation

The implementation created in the previous tutorial for the Levenshtein distance is provided below. The primary method name is levenshtein(), and it accepts 2 arguments—token1 and token2, representing the 2 words to be compared. it returns an integer representing the distance between these 2 words.

    int levenshtein(String token1, String token2) {
        int[] distances = new int[token1.length() + 1];

        for (int t1 = 1; t1 <= token1.length(); t1++) {
            if (token1.charAt(t1 - 1) == token2.charAt(0)) {
                distances[t1] = calcMin(distances[t1 - 1], t1 - 1, t1);
            } else {
                distances[t1] = calcMin(distances[t1 - 1], t1 - 1, t1) + 1;
            }
        }

        int dist = 0;
        for (int t2 = 1; t2 < token2.length(); t2++) {
            dist = t2 + 1;
            for (int t1 = 1; t1 <= token1.length(); t1++) {
                int tempDist;
                if (token1.charAt(t1 - 1) == token2.charAt(t2)) {
                    tempDist = calcMin(dist, distances[t1 - 1], distances[t1]);
                } else {
                    tempDist = calcMin(dist, distances[t1 - 1], distances[t1]) + 1;
                }
                distances[t1 - 1] = dist;
                dist = tempDist;
            }
            distances[token1.length()] = dist;
        }
        return dist;
    }

    static int calcMin(int a, int b, int c) {
        if (a <= b && a <= c) {
            return a;
        } else if (b <= a && b <= c) {
            return b;
        } else {
            return c;
        }
    }

For our application, the distance is to be calculated between the user text input and all words in the dictionary. The closest words to the input are suggested to the user for autocompletion. The next section reads the dictionary and calculates the distance between all words within it, as well as the user input.

Comparing the Dictionary to the User Input and Making Suggestions

The dictionary used in this tutorial is available as a text file, which can be downloaded here. This section creates a method named returnSuggestions(), which accepts 2 arguments:

  1. word: The text entered by the user.
  2. numWords: Number of suggestions to return.

The method returns a String array representing the closest words to the user input.

Because the dictionary has 20,000 words, the method creates a String array of length 20,000 to hold all words in the dictionary alongside their distance to the user input.

After that, the text file is read according to the code below. The text file, in this example, is named 20k.txt, which is available in the assets directory of the Android project. The text file holds only 1 word per line, and thus the readLine() method is used to return each word in the line variable.

After the word is returned, the levenshtein() method is called to calculate the distance between the user input and the current word in the dictionary. The distance is saved into the integer variable named wordDistance. The distance, separated by – from the dictionary word, is inserted into the previously created array dictWordDist. For example, if the word is age and the distance is 5, then 5-age is inserted into the array. By doing this, we’ll be able to sort the words according to their distances.

        BufferedReader reader;
        int wordIdx = 0;
        try {
            int wordDistance;
//            https://github.com/first20hours/google-10000-english
            reader = new BufferedReader(new InputStreamReader(getAssets().open("20k.txt")));
            String line = reader.readLine();
            while (line != null) {
                wordDistance = levenshtein(line.trim(), word);
                dictWordDist[wordIdx] = wordDistance + "-" + line.trim();
                line = reader.readLine();
                wordIdx++;
            }
            reader.close();
        } catch (IOException e) {
            System.err.println("Failed to read the file.");
            e.printStackTrace();
            return null;
        }

After calculating the distances between all 20,000 dictionary words and the user input, the distances separated by – from the words will be available in the dictWordDist array. The next step is to sort the array in ascending order so that the closest words are available at the beginning of the array.

If the user entered the word hello, here are the 10 elements in the sorted array to show how things might look like.

The next step to create a String array that holds the closest words.

Here’s how the closest words are fetched. Using a for loop that loops through the first numWords words in the array dictWordDist, each element is returned into the variable currWordDist. To separate the textual word from its numerical distance, the split() method is used to return an array named wordDetails holding 2 elements. The first one is the distance, and the other one is the dictionary word. Only the word is inserted in the closestWords array.

At this moment, we successfully created a String array named closestWords that holds the closest words to the user input. These words will be used as suggestions from which the user can select to automatically complete the word being entered.

Here is the complete code of the returnSuggestions() method:

    String [] returnSuggestions(String word, int numWords){
        String[] dictWordDist = new String[20000];
        BufferedReader reader;
        int wordIdx = 0;
        try {
            int wordDistance;
//            https://github.com/first20hours/google-10000-english
            reader = new BufferedReader(new InputStreamReader(getAssets().open("20k.txt")));
            String line = reader.readLine();
            while (line != null) {
                wordDistance = levenshtein(line.trim(), word);
                dictWordDist[wordIdx] = wordDistance + "-" + line.trim();
                line = reader.readLine();
                wordIdx++;
            }
            reader.close();
        } catch (IOException e) {
            System.err.println("Failed to read the file.");
            e.printStackTrace();
            return null;
        }
        Arrays.sort(dictWordDist);
        String[] closestWords = new String[numWords];
        String currWordDist;
        for (int i = 0; i < numWords; i++) {
            currWordDist = dictWordDist[i];
            String[] wordDetails = currWordDist.split("-");
            closestWords[i] = wordDetails[1];
            System.out.println(wordDetails[0] + " " + wordDetails[1]);
        }
        return closestWords;
    }

At this point, the implementation of the activity is done. Here’s the complete code:

public class MainActivity extends AppCompatActivity {
    EditText enteredWord;
    Button firstSuggestion;
    Button secondSuggestion;
    Button thirdSuggestion;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        enteredWord = findViewById(R.id.inputWord);
        firstSuggestion = findViewById(R.id.firstSuggestion);
        secondSuggestion = findViewById(R.id.secondSuggestion);
        thirdSuggestion = findViewById(R.id.thirdSuggestion);
        enteredWord.addTextChangedListener(new TextWatcher() {
            @Override
            public void beforeTextChanged(CharSequence charSequence, int i, int i1, int i2) {
            }

            @Override
            public void onTextChanged(CharSequence charSequence, int i, int i1, int i2) {
                String word = enteredWord.getText().toString();
                if (word.length() == 0){
                    return;
                }
                String [] closestWord = returnSuggestions(word, 3);
                if (closestWord != null){
                    firstSuggestion.setText(closestWord[0]);
                    secondSuggestion.setText(closestWord[1]);
                    thirdSuggestion.setText(closestWord[2]);
                }
            }

            @Override
            public void afterTextChanged(Editable editable) {
            }
        });
    }

    int levenshtein(String token1, String token2) {
        int[] distances = new int[token1.length() + 1];

        for (int t1 = 1; t1 <= token1.length(); t1++) {
            if (token1.charAt(t1 - 1) == token2.charAt(0)) {
                distances[t1] = calcMin(distances[t1 - 1], t1 - 1, t1);
            } else {
                distances[t1] = calcMin(distances[t1 - 1], t1 - 1, t1) + 1;
            }
        }

        int dist = 0;
        for (int t2 = 1; t2 < token2.length(); t2++) {
            dist = t2 + 1;
            for (int t1 = 1; t1 <= token1.length(); t1++) {
                int tempDist;
                if (token1.charAt(t1 - 1) == token2.charAt(t2)) {
                    tempDist = calcMin(dist, distances[t1 - 1], distances[t1]);
                } else {
                    tempDist = calcMin(dist, distances[t1 - 1], distances[t1]) + 1;
                }
                distances[t1 - 1] = dist;
                dist = tempDist;
            }
            distances[token1.length()] = dist;
        }
        return dist;
    }

    static int calcMin(int a, int b, int c) {
        if (a <= b && a <= c) {
            return a;
        } else if (b <= a && b <= c) {
            return b;
        } else {
            return c;
        }
    }

    String [] returnSuggestions(String word, int numWords){
        String[] dictWordDist = new String[20000];
        BufferedReader reader;
        int wordIdx = 0;
        try {
            int wordDistance;
//            https://github.com/first20hours/google-10000-english
            reader = new BufferedReader(new InputStreamReader(getAssets().open("20k.txt")));
            String line = reader.readLine();
            while (line != null) {
                wordDistance = levenshtein(line.trim(), word);
                dictWordDist[wordIdx] = wordDistance + "-" + line.trim();
                line = reader.readLine();
                wordIdx++;
            }
            reader.close();
        } catch (IOException e) {
            System.err.println("Failed to read the file.");
            e.printStackTrace();
            return null;
        }
        Arrays.sort(dictWordDist);
        String[] closestWords = new String[numWords];
        String currWordDist;
        for (int i = 0; i < numWords; i++) {
            currWordDist = dictWordDist[i];
            String[] wordDetails = currWordDist.split("-");
            closestWords[i] = wordDetails[1];
            System.out.println(wordDetails[0] + " " + wordDetails[1]);
        }
        return closestWords;
    }

    public void selectWord(View view){
        Button button = (Button) view;
        enteredWord.setText(button.getText() + " ");
        enteredWord.setSelection(enteredWord.getText().length());
    }
}

Here is what the Android app looks like when the user enters beginn. Once an option is clicked, it will be transferred directly to the EditText view.

The app could also suggest corrections for mistakes:

Conclusion

This tutorial used the Levenshtein distance to create an Android app that generates suggestions to the user while typing that helps in automatically completing or correcting a single word.

The user text input is compared to a dictionary of words. The words with the least distance are suggested to the user. The user can then select a suggested word rather than typing it.

Fritz

Our team has been at the forefront of Artificial Intelligence and Machine Learning research for more than 15 years and we're using our collective intelligence to help others learn, understand and grow using these new technologies in ethical and sustainable ways.

Comments 0 Responses

Leave a Reply

Your email address will not be published. Required fields are marked *

wix banner square