C program to find first occurrence of a word in a string

Learn how to write a C program that finds the first occurrence of a word in a string without using strstr. This guide includes a step-by-step explanation, a code example, and a usage scenario for text search in C.

In text processing, finding the first occurrence of a specific word within a string is a common task. This article guides you through writing a C program that locates the first occurrence of a word in a given string with a hand-written search loop instead of strstr (the program still uses strlen and strcspn to handle input). We also provide a version that uses strstr for comparison.

Problem Statement

Given a string and a word, we need to find the position of the first occurrence of the word within the string. If the word is found, the program should print its zero-based starting index; otherwise, it should report that the word is not present. The search is a plain substring match, so looking for "at" would also match inside "cat".

Approach

To solve this problem, we can follow these steps:

  1. Take the input string from the user.
  2. Take the word to search for from the user.
  3. Iterate through the string and check for the occurrence of the word character by character.
  4. Print the position of the first occurrence or indicate if the word is not found.
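
As a rough illustration of step 3, the search can be sketched as a small function that uses no library string functions at all. The name find_first and the convention of returning -1 for "not found" are assumptions made for this sketch, not part of the program below.

```c
#include <stddef.h>

/* Hypothetical helper (name and -1 convention are assumptions):
   returns the zero-based index of the first occurrence of word in str,
   or -1 if word does not occur. */
int find_first(const char *str, const char *word) {
    size_t i, j;

    for (i = 0; ; i++) {
        /* Compare word against the text that starts at position i. */
        for (j = 0; word[j] != '\0' && str[i + j] == word[j]; j++) {
        }
        if (word[j] == '\0') {
            return (int)i;   /* every character of word matched */
        }
        if (str[i + j] == '\0') {
            return -1;       /* str ended first, so no later match is possible */
        }
    }
}
```

Calling find_first("learn programming at ProCoding", "at") yields 18. An empty word is treated as found at index 0, which matches how strstr behaves.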

Write a C program to find first occurrence of a word in a string

Here is a simple C program that finds the first occurrence of a word in a string without using strstr:

#include <stdio.h>
#include <string.h>

int main() {
    char str[200], word[50];
    int strLen, wordLen, i, j, found;

    // Taking string input from the user
    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);

    // Removing the newline character added by fgets
    str[strcspn(str, "\n")] = '\0';

    // Taking the word input from the user
    printf("Enter the word to find: ");
    fgets(word, sizeof(word), stdin);

    // Removing the newline character added by fgets
    word[strcspn(word, "\n")] = '\0';

    strLen = strlen(str);
    wordLen = strlen(word);

    // Finding the first occurrence of the word in the string
    for (i = 0; i <= strLen - wordLen; i++) {
        found = 1;
        for (j = 0; j < wordLen; j++) {
            if (str[i + j] != word[j]) {
                found = 0;
                break;
            }
        }
        if (found) {
            printf("The first occurrence of the word '%s' is at index %d.\n", word, i);
            return 0;
        }
    }

    printf("The word '%s' is not found in the string.\n", word);

    return 0;
}

Output

Enter a string: learn programming at ProCoding
Enter the word to find: at
The first occurrence of the word 'at' is at index 18.

Explanation of the Program

  1. Taking String Input:

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    
    • fgets reads at most sizeof(str) - 1 characters from the user's input, so it cannot write past the end of the buffer.
  2. Removing the Newline Character from the String:

    str[strcspn(str, "\n")] = '\0';
    
    • strcspn returns the index of the first \n (newline character) in str, or the length of str if there is no newline.
    • Writing \0 (null terminator) at that index removes the newline that fgets keeps, and is harmless when no newline is present.
  3. Taking the Word Input:

    printf("Enter the word to find: ");
    fgets(word, sizeof(word), stdin);
    
    • fgets reads the word to be searched from the user.
  4. Removing the Newline Character from the Word:

    word[strcspn(word, "\n")] = '\0';
    
    
    • Similarly, we remove the newline character from the input word.
  5. Finding the First Occurrence:

    for (i = 0; i <= strLen - wordLen; i++) {
        found = 1;
        for (j = 0; j < wordLen; j++) {
            if (str[i + j] != word[j]) {
                found = 0;
                break;
            }
        }
        if (found) {
            printf("The first occurrence of the word '%s' is at index %d.\n", word, i);
            return 0;
        }
    }
    
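    • Iterate over every starting index i from 0 to strLen - wordLen, the last position where the whole word can still fit. Because both lengths are stored as int, this bound is negative when the word is longer than the string, and the loop is skipped.
    • At each position, assume a match (found = 1) and compare the characters of the string, starting at i, with the word.
    • On the first mismatch, set found to 0, break out of the inner loop, and move on to the next position.
    • If found is still 1 after the inner loop, every character matched: print the starting index i and exit.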
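
The newline-removal idiom from steps 2 and 4 can be isolated in a small helper to see how it behaves with and without a trailing newline. This is only a sketch; the name strip_newline is made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: cut s at its first newline, if it has one.
   strcspn returns the index of the first '\n', or strlen(s) when s
   contains no newline; in that case the assignment simply rewrites
   the existing terminator. */
void strip_newline(char *s) {
    s[strcspn(s, "\n")] = '\0';
}
```

Applied to a buffer holding "fox\n" this leaves "fox", and a buffer that already holds "fox" is left unchanged. The second case matters because fgets omits the newline when the input line is longer than the buffer or ends at end-of-file.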

Example Usage

If the user inputs:

  • String: "The quick brown fox jumps over the lazy dog"
  • Word to find: "fox"

The program will output:

The first occurrence of the word 'fox' is at index 16.

Using strstr for Finding the First Occurrence

For comparison, here is how you can achieve the same result using the strstr function:

#include <stdio.h>
#include <string.h>

int main() {
    char str[200], word[50];
    char *pos;

    // Taking string input from the user
    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);

    // Removing the newline character added by fgets
    str[strcspn(str, "\n")] = '\0';

    // Taking the word input from the user
    printf("Enter the word to find: ");
    fgets(word, sizeof(word), stdin);

    // Removing the newline character added by fgets
    word[strcspn(word, "\n")] = '\0';

    // Finding the first occurrence of the word in the string using strstr
    pos = strstr(str, word);

    if (pos) {
        int index = (int)(pos - str);
        printf("The first occurrence of the word '%s' is at index %d.\n", word, index);
    } else {
        printf("The word '%s' is not found in the string.\n", word);
    }

    return 0;
}

Example Usage with strstr

If the user inputs:

  • String: "The quick brown fox jumps over the lazy dog"
  • Word to find: "fox"

The program will output:

The first occurrence of the word 'fox' is at index 16.

The strstr function is a standard library function in C used to find the first occurrence of a substring in a string. It is declared in the <string.h> header file. The function returns a pointer to the first occurrence of the substring in the string, or NULL if the substring is not found.

Syntax

char *strstr(const char *haystack, const char *needle);
  • haystack: This is the main string in which we want to search for the substring.
  • needle: This is the substring that we want to find in the main string.

Working of strstr

  1. Initialization:
    • strstr starts by taking the two strings: haystack (the main string) and needle (the substring to be found).
  2. Edge Cases:
    • If needle is an empty string, strstr returns haystack immediately because an empty substring is trivially found at the start of any string.
  3. Search Process:
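    • Conceptually, the function tries each starting position in haystack in turn. Library implementations often use faster algorithms, but the result is the same.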
    • For each position in haystack, it checks if the substring needle starts at that position.
    • This is done by comparing the characters of needle with the characters in haystack starting from the current position.
  4. Character Comparison:
    • If the characters match, it continues comparing the next characters.
    • If all characters of needle match consecutively, strstr returns a pointer to the starting position of needle in haystack.
  5. No Match:
    • If a mismatch is found, strstr moves to the next character in haystack and repeats the process.
    • If the end of haystack is reached without finding needle, strstr returns NULL.
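
The steps above can be sketched as a straightforward pointer-based function. This is an illustrative re-implementation under the name naive_strstr (an assumed name), not the code any particular C library actually ships.

```c
#include <stddef.h>

/* Hypothetical re-implementation for illustration only. */
char *naive_strstr(const char *haystack, const char *needle) {
    const char *start, *h, *n;

    if (*needle == '\0') {
        return (char *)haystack;   /* empty needle matches at the start */
    }
    for (start = haystack; *start != '\0'; start++) {
        h = start;
        n = needle;
        /* Advance while needle has characters left and both strings agree. */
        while (*n != '\0' && *h == *n) {
            h++;
            n++;
        }
        if (*n == '\0') {
            return (char *)start;  /* the whole needle matched here */
        }
    }
    return NULL;                   /* reached the end of haystack */
}
```

As with strstr, the zero-based index is recovered by subtracting the start of the string from the returned pointer, after checking that the pointer is not NULL.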

Advantages of strstr

  • Simplicity: It provides a straightforward and concise way to find a substring within a string.
  • Efficiency: It is optimized for performance in most standard library implementations.

Limitations of strstr

  • Case Sensitivity: strstr is case-sensitive. For a case-insensitive search, use strcasestr (a nonstandard extension available on glibc and BSD systems) or implement custom logic.
  • First Occurrence Only: strstr returns only the first match. To find every occurrence, including overlapping ones, you must call it repeatedly from successive positions.
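
Both limitations can be worked around with a little extra code. The sketch below shows one possible approach; the helper names count_occurrences and find_first_nocase are assumptions, as is the choice to return 0 for an empty word and -1 for "not found".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts every occurrence of word in str,
   overlapping ones included. An empty word counts as 0 (an assumption). */
size_t count_occurrences(const char *str, const char *word) {
    size_t count = 0;
    const char *pos;

    if (*word == '\0') {
        return 0;                     /* avoid matching the empty string forever */
    }
    pos = strstr(str, word);
    while (pos != NULL) {
        count++;
        pos = strstr(pos + 1, word);  /* resume one character later to allow overlaps */
    }
    return count;
}

/* Hypothetical helper: zero-based index of the first case-insensitive
   occurrence of word in str, or -1 if there is none. */
long find_first_nocase(const char *str, const char *word) {
    size_t i, j;

    for (i = 0; ; i++) {
        for (j = 0; word[j] != '\0' &&
                    tolower((unsigned char)str[i + j]) ==
                    tolower((unsigned char)word[j]); j++) {
        }
        if (word[j] == '\0') {
            return (long)i;           /* all of word matched, ignoring case */
        }
        if (str[i + j] == '\0') {
            return -1;                /* str ended before word did */
        }
    }
}
```

For example, count_occurrences("aaaa", "aa") gives 3 because the matches at indices 0, 1, and 2 overlap, and find_first_nocase("The quick brown FOX", "fox") gives 16. The casts to unsigned char keep tolower well defined for characters with negative char values.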
