Concurrent File Word Counter

You are building a tool to analyze large amounts of text data spread across multiple files. Your task is to create a program that counts occurrences of a specific word across those files concurrently. To conserve system resources, you must limit how many files are processed at the same time.

Requirements

  • Accept a list of file paths and the target word as input.
  • Count occurrences of the word in each file concurrently using goroutines.
  • Limit the number of simultaneously open files to 4.
  • Safely aggregate the total word count from all files without race conditions.
  • Log errors (e.g., file not found or read error) without terminating the program.

Program Structure

The project structure should look like this:

.
├── go.mod
├── main.go
└── main_test.go

1 directory, 3 files

Your program should have the following structure and implement these functions:

main.go

package main

import (
	"fmt"
	"os"
	"sync"
)

const MaxConcurrency = 4 // Maximum number of files processed (open) at the same time

func main() {
	// Validate the command-line arguments
	if len(os.Args) < 3 {
		fmt.Println("Usage: go run main.go <file1> <file2> ... <target_word>")
		return
	}

	// Parse file paths and target word
	files := os.Args[1 : len(os.Args)-1] // All arguments except the last one are file paths
	targetWord := os.Args[len(os.Args)-1] // The last argument is the target word

	fmt.Printf("Searching for the word \"%s\" in %d files...\n", targetWord, len(files))

	// Call the function to count words concurrently
	totalCount, err := countWordConcurrently(files, targetWord)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	// Print the total count
	fmt.Printf("Total occurrences of \"%s\": %d\n", targetWord, totalCount)
}

// countWordConcurrently processes multiple files concurrently and counts occurrences of the target word.
// Uses goroutines and channels to process files and aggregates the total count safely.
func countWordConcurrently(files []string, targetWord string) (int, error) {
	// TODO: Implement the logic to:
	// - Use a buffered channel to limit the number of concurrently open files.
	// - Spawn worker goroutines to process files.
	// - Safely aggregate the total word count using sync.Mutex or atomic operations.
	return 0, nil
}

// countWordInFile counts the occurrences of the target word in a single file.
func countWordInFile(filePath, targetWord string) (int, error) {
	// TODO: Implement the logic to:
	// - Open the file at the given file path.
	// - Read the file line by line.
	// - Count occurrences of the target word in each line.
	// - Return the total count for the file and handle any file errors.
	return 0, nil
}

// countWordInLine counts occurrences of the target word in a single line of text.
// The comparison should be case-insensitive.
func countWordInLine(line, targetWord string) int {
	// TODO: Implement the logic to:
	// - Split the line into words.
	// - Count occurrences of the target word, ignoring case.
	return 0
}
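
To make the two helper TODOs above more concrete, here is a minimal sketch of one possible approach, not the reference solution. It assumes that a "word" is a whitespace-separated token (so "error." with trailing punctuation does not match "error") and that files are read with bufio.Scanner; the wrapped error messages are illustrative wording only.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// countWordInLine splits the line on whitespace and counts the tokens that
// equal targetWord, ignoring case. Assumption: punctuation is NOT stripped.
func countWordInLine(line, targetWord string) int {
	count := 0
	for _, word := range strings.Fields(line) {
		if strings.EqualFold(word, targetWord) {
			count++
		}
	}
	return count
}

// countWordInFile opens the file, scans it line by line, and sums the
// per-line counts. Open and read errors are returned to the caller.
func countWordInFile(filePath, targetWord string) (int, error) {
	f, err := os.Open(filePath)
	if err != nil {
		return 0, fmt.Errorf("failed to open file %q: %w", filePath, err)
	}
	defer f.Close()

	total := 0
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		total += countWordInLine(scanner.Text(), targetWord)
	}
	if err := scanner.Err(); err != nil {
		return total, fmt.Errorf("failed to read file %q: %w", filePath, err)
	}
	return total, nil
}

func main() {
	fmt.Println(countWordInLine("Error test ERROR hello error", "error")) // prints 3

	// A missing file yields an error instead of a panic.
	if _, err := countWordInFile("missing.txt", "error"); err != nil {
		fmt.Println("Error:", err)
	}
}
```

Note that bufio.Scanner rejects lines longer than its default token limit (64 KiB); if your input may contain very long lines, consider enlarging the buffer with scanner.Buffer or reading with bufio.Reader instead.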

main_test.go

Copy the test file below into main_test.go and test your implementation. All test cases must pass.

package main

import (
	"os"
	"path/filepath"
	"testing"
)

func TestCountWordConcurrently(t *testing.T) {
	// Setup: Create temporary test files
	inputDir := "test_files"
	if err := os.MkdirAll(inputDir, 0755); err != nil {
		t.Fatalf("Failed to create input directory: %v", err)
	}
	defer os.RemoveAll(inputDir)

	files := []string{
		filepath.Join(inputDir, "file1.txt"),
		filepath.Join(inputDir, "file2.txt"),
		filepath.Join(inputDir, "file3.txt"),
	}

	content := []string{
		"error error test\nhello error",
		"this is an error line\nanother error error line",
		"no error here\njust some words",
	}

	// Write content to files
	for i, file := range files {
		if err := os.WriteFile(file, []byte(content[i]), 0644); err != nil {
			t.Fatalf("Failed to create test file %s: %v", file, err)
		}
	}

	// Test: Count the word "error"
	totalCount, err := countWordConcurrently(files, "error")
	if err != nil {
		t.Fatalf("Error counting words: %v", err)
	}

	// Verify
	expectedCount := 7 // "error" appears 7 times across the test files
	if totalCount != expectedCount {
		t.Errorf("Expected %d occurrences, but got %d", expectedCount, totalCount)
	}
}

func TestCountWordInFile(t *testing.T) {
	// Setup: Create a temporary file
	filePath := "test_file.txt"
	content := "hello world\nthis is a test\nerror error\nanother error"

	if err := os.WriteFile(filePath, []byte(content), 0644); err != nil {
		t.Fatalf("Failed to create test file: %v", err)
	}
	defer os.Remove(filePath)

	// Test: Count the word "error"
	count, err := countWordInFile(filePath, "error")
	if err != nil {
		t.Fatalf("Error counting words: %v", err)
	}

	// Verify
	expectedCount := 3 // "error" appears 3 times in the file
	if count != expectedCount {
		t.Errorf("Expected %d occurrences, but got %d", expectedCount, count)
	}
}

func TestCountWordInLine(t *testing.T) {
	// Test: Count the word "error" in a single line
	line := "error test error hello error"
	count := countWordInLine(line, "error")

	// Verify
	expectedCount := 3 // "error" appears 3 times in the line
	if count != expectedCount {
		t.Errorf("Expected %d occurrences, but got %d", expectedCount, count)
	}
}

To run the test cases, execute:

go test -race -v

Acceptance Criteria

  • The program must handle concurrent processing of files without race conditions.
  • No more than 4 files should be processed at the same time.
  • The total word count must be accurate.
  • Errors (e.g., missing files) should not terminate the program but should be logged.
  • All test cases must pass.
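
If you want to convince yourself that the "no more than 4 at a time" criterion holds, one option is to measure the peak number of tasks running at once. The sketch below is only an illustration: peakConcurrency is a hypothetical helper, and the short sleep stands in for the real file work.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// peakConcurrency (hypothetical helper) runs n simulated tasks behind a
// buffered-channel semaphore of size limit and returns the highest number
// of tasks that were observed running at the same time.
func peakConcurrency(n, limit int) int {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak int64

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot (blocks when limit tasks are running)
			defer func() { <-sem }() // release the slot when the task ends

			cur := atomic.AddInt64(&running, 1)
			// Raise the recorded peak if this task pushed it higher.
			for {
				old := atomic.LoadInt64(&peak)
				if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
					break
				}
			}

			time.Sleep(5 * time.Millisecond) // stand-in for reading a file
			atomic.AddInt64(&running, -1)
		}()
	}

	wg.Wait()
	return int(atomic.LoadInt64(&peak))
}

func main() {
	fmt.Println("peak concurrency:", peakConcurrency(20, 4))
}
```

Because the running counter is decremented before the slot is released, the reported peak can never exceed the limit.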

Hints

  • 💡 Use sync.Mutex to protect shared resources (e.g., total count).
  • 💡 Use buffered channels to limit the number of concurrently open files.
  • 🚫 Avoid global variables; encapsulate logic in functions.
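
The first two hints can be combined as in the following sketch. It is a sketch under stated assumptions rather than the expected solution: countAll and its count callback are hypothetical names introduced here so the example runs without touching the disk, and in your program the callback would be countWordInFile with the target word bound.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

const maxConcurrency = 4 // mirrors MaxConcurrency from main.go

// countAll (hypothetical helper) calls count for every path, with at most
// maxConcurrency calls in flight, and sums the results under a mutex.
// Failures are logged and skipped so one bad file does not stop the rest.
func countAll(files []string, count func(path string) (int, error)) int {
	sem := make(chan struct{}, maxConcurrency) // buffered channel used as a semaphore
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)

	for _, f := range files {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot before starting the worker
		go func(path string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot

			n, err := count(path)
			if err != nil {
				log.Printf("Error: failed to process %q: %v", path, err)
				return
			}

			mu.Lock()
			total += n
			mu.Unlock()
		}(f)
	}

	wg.Wait()
	return total
}

func main() {
	// Fake counter for demonstration: the "count" is the path length.
	fake := func(path string) (int, error) { return len(path), nil }
	fmt.Println(countAll([]string{"a.txt", "bb.txt", "ccc.txt"}, fake)) // prints 18
}
```

Everything lives in local variables, which also satisfies the third hint about avoiding globals. The mutex could be replaced with atomic.AddInt64 on an int64 total; both approaches should pass go test -race.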

Example Input/Output

Input:

$ go run main.go file1.txt file2.txt file3.txt target_word

Output:

Searching for the word "target_word" in 3 files...
Total occurrences of "target_word": 42

Error Log Example:

Error: Failed to open file "missing.txt": file not found.
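
One way to produce a log line in this style is sketched below. The helper name openErrorMessage is hypothetical, and the message wording simply imitates the example above; the exercise does not prescribe an exact format.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
)

// openErrorMessage (hypothetical helper) tries to open path and returns a
// log-ready message if that fails, or an empty string on success.
func openErrorMessage(path string) string {
	f, err := os.Open(path)
	if err == nil {
		f.Close()
		return ""
	}
	if errors.Is(err, fs.ErrNotExist) {
		return fmt.Sprintf("Failed to open file %q: file not found.", path)
	}
	return fmt.Sprintf("Failed to open file %q: %v", path, err)
}

func main() {
	for _, p := range []string{"missing.txt"} {
		if msg := openErrorMessage(p); msg != "" {
			log.Printf("Error: %s", msg) // log and move on to the next file
			continue
		}
		fmt.Println("opened", p)
	}
}
```

Using log.Printf followed by continue (or an early return inside a worker goroutine) keeps the program running, whereas log.Fatalf would exit the process and violate the requirement.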

    Golang (Go) for Production Systems
