How to resolve the algorithm Rosetta Code/Find bare lang tags step by step in the Java programming language

Published on 12 May 2024 09:40 PM

Problem Statement

Find all <lang> tags without a language specified in the text of a page.
Display counts by language section.

Allow multiple files to be read. Summarize all results by language.

Use the MediaWiki API to test actual RC tasks.

Let's start with the solution:

Step-by-step solution to Rosetta Code/Find bare lang tags in the Java programming language

The code below, written in Java, uses the java.net.http HTTP client (a standard API since Java 11) to connect to the Rosetta Code wiki and retrieve the raw text of programming task pages. Its goal is to count bare tags, that is, <lang> tags that do not name a language, grouped by the language section in which they appear.
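To make the notion of a bare tag concrete, here is a minimal sketch that reuses the same regular expression as BARE_PREDICATE in the listing and treats a bare tag as a <lang> tag with no language argument. The class name BareTagDemo, the helper isBare, and the sample lines are made up for illustration.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

class BareTagDemo {
    // Same expression as BARE_PREDICATE in the listing; asPredicate() tests
    // whether the pattern is found anywhere in the input string.
    static final Predicate<String> BARE = Pattern.compile("<lang>").asPredicate();

    // Hypothetical helper: true if the line contains a <lang> tag with no language.
    static boolean isBare(String line) {
        return BARE.test(line);
    }

    public static void main(String[] args) {
        // Illustrative sample lines, not taken from a real wiki page.
        List<String> samples = List.of(
                "<lang>print 'hi'</lang>",
                "<lang java>System.out.println();</lang>",
                "plain text without any tag");
        for (String s : samples) {
            System.out.println(isBare(s) + "  " + s);
        }
    }
}
```

Only the first sample line is reported as bare: the second names a language inside the tag, and a closing </lang> tag does not contain the text <lang>.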

Here's a step-by-step explanation of the code:

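  1. Importing necessary libraries: The code begins by importing java.net.URI and the java.net.http classes for sending HTTP requests, java.util.HashMap and java.util.Map for the counts, java.util.concurrent.atomic.AtomicReference, java.util.function.Predicate, and the java.util.regex and java.util.stream classes used for pattern matching and collecting results.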

  2. Defining constants: The code defines several constants:

    • BASE: The base URL of the Rosetta Code website (http://rosettacode.org).
    • TITLE_PATTERN: A regular expression pattern for extracting page titles from the API response.
    • HEADER_PATTERN: A regular expression pattern for extracting headers (language names) from the wiki pages.
    • BARE_PREDICATE: A predicate (function that returns a boolean) for identifying lines containing bare tags (<lang>).

  3. Setting up the HTTP client: The code creates an instance of the HTTP client (HttpClient) by calling HttpClient.newBuilder().build().

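  4. Fetching page titles: The code sends an HTTP GET request to the MediaWiki API of Rosetta Code to list the members of the "Programming_Tasks" category. It does not parse the response as JSON; instead it applies TITLE_PATTERN to the response body to extract the page titles. The request sets no cmlimit or continuation parameter, so only the first batch of titles returned by the API is processed.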

  5. Iterating over page titles: For each page title obtained in the previous step, the code sends another HTTP GET request to retrieve the raw content of that page.

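  6. Parsing page content: The raw content of each page is scanned line by line. A line that matches HEADER_PATTERN sets the current language; any other line containing a bare <lang> tag is counted under that language, or under "no language" if no header has been seen yet.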

  7. Counting bare tags: The code maintains a map (countMap) to track the count of bare tags for each programming language.

  8. Printing results: Finally, the code prints the count of bare tags for each programming language.
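Steps 6 to 8 can be tried without any network access. The sketch below applies the same two regular expressions to an in-memory string; it is not the listing's code. It swaps the lambda and AtomicReference for a plain loop, uses Map.merge for the increment, and uses a TreeMap so the output order is stable. The class name BareTagCounter, the method count, and the sample page are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BareTagCounter {
    // Same expressions as HEADER_PATTERN and BARE_PREDICATE in the listing.
    static final Pattern HEADER = Pattern.compile("==\\{\\{header\\|([^}]+)}}==");
    static final Pattern BARE = Pattern.compile("<lang>");

    // Counts lines containing a bare <lang> tag, keyed by the most recent header.
    static Map<String, Integer> count(String wikitext) {
        Map<String, Integer> counts = new TreeMap<>();
        String language = "no language"; // used until the first header is seen
        for (String line : wikitext.split("\n")) {
            Matcher header = HEADER.matcher(line);
            if (header.matches()) {
                language = header.group(1);
            } else if (BARE.matcher(line).find()) {
                counts.merge(language, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hand-written sample wikitext, not a real Rosetta Code page.
        String page = String.join("\n",
                "Intro text <lang>pseudo</lang>",
                "=={{header|Perl}}==",
                "<lang>print 1;</lang>",
                "=={{header|Java}}==",
                "<lang java>int x;</lang>");
        count(page).forEach((lang, n) -> System.out.printf("%d in %s%n", n, lang));
    }
}
```

Like the listing, this counts lines rather than individual tags, so two bare tags on one line contribute a single count.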

In summary, this code uses the Java HTTP client to retrieve pages from the Rosetta Code wiki and counts bare <lang> tags per language section. It demonstrates how to make HTTP requests, scan an API response and raw wikitext with regular expressions, and work with maps and atomic references in Java.
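One detail of the HTTP part is how each page address is built. The listing uses a multi-argument java.net.URI constructor, which percent-encodes characters that are illegal in a URI, such as the spaces found in many task titles. The following sketch isolates that step; the class name PageUriSketch, the helper rawPageUri, and the sample title are hypothetical, and the host and path are copied from the listing rather than checked against the live site.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PageUriSketch {
    // Builds the raw-page address with the (scheme, authority, path, query, fragment)
    // constructor. Illegal characters such as spaces are quoted automatically;
    // characters that are legal in a query, such as '&', are left unchanged.
    static URI rawPageUri(String title) throws URISyntaxException {
        return new URI("http", "rosettacode.org", "/wiki", "action=raw&title=" + title, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Sample title chosen only because it contains a space.
        System.out.println(rawPageUri("Hello world/Text"));
    }
}
```

Because '&' and similar characters are not escaped by this constructor, a title containing them would need extra encoding before being appended to the query.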

Source code in the Java programming language

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class FindBareTags {
    private static final String BASE = "http://rosettacode.org";

    private static final Pattern TITLE_PATTERN = Pattern.compile("\"title\": \"([^\"]+)\"");
    private static final Pattern HEADER_PATTERN = Pattern.compile("==\\{\\{header\\|([^}]+)}}==");
    private static final Predicate<String> BARE_PREDICATE = Pattern.compile("<lang>").asPredicate();

    public static void main(String[] args) throws Exception {
        var client = HttpClient.newBuilder().build();

        URI titleUri = URI.create(BASE + "/mw/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Tasks");
        var titleRequest = HttpRequest.newBuilder(titleUri).GET().build();

        var titleResponse = client.send(titleRequest, HttpResponse.BodyHandlers.ofString());
        if (titleResponse.statusCode() == 200) {
            var titleBody = titleResponse.body();

            var titleMatcher = TITLE_PATTERN.matcher(titleBody);
            var titleList = titleMatcher.results().map(mr -> mr.group(1)).collect(Collectors.toList());

            var countMap = new HashMap<String, Integer>();
            for (String title : titleList) {
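                // The multi-argument URI constructor percent-encodes characters that are illegal in a URI, such as spaces in titles.
                var pageUri = new URI("http", "rosettacode.org", "/wiki", "action=raw&title=" + title, null);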
                var pageRequest = HttpRequest.newBuilder(pageUri).GET().build();
                var pageResponse = client.send(pageRequest, HttpResponse.BodyHandlers.ofString());
                if (pageResponse.statusCode() == 200) {
                    var pageBody = pageResponse.body();

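                    // A lambda may only capture effectively final locals, so the current language is held in an AtomicReference.
                    AtomicReference<String> language = new AtomicReference<>("no language");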
                    pageBody.lines().forEach(line -> {
                        var headerMatcher = HEADER_PATTERN.matcher(line);
                        if (headerMatcher.matches()) {
                            language.set(headerMatcher.group(1));
                        } else if (BARE_PREDICATE.test(line)) {
                            int count = countMap.getOrDefault(language.get(), 0) + 1;
                            countMap.put(language.get(), count);
                        }
                    });
                } else {
                    System.out.printf("Got a %d status code%n", pageResponse.statusCode());
                }
            }

            for (Map.Entry<String, Integer> entry : countMap.entrySet()) {
                System.out.printf("%d in %s%n", entry.getValue(), entry.getKey());
            }
        } else {
            System.out.printf("Got a %d status code%n", titleResponse.statusCode());
        }
    }
}
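
The title-extraction step in the listing above can also be tried offline. This sketch applies the same expression as TITLE_PATTERN to a hand-written string shaped like pretty-printed JSON; the class name TitleExtractSketch, the method titles, and the sample text are assumptions for illustration and not a captured API response.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TitleExtractSketch {
    // Same expression as TITLE_PATTERN in the listing. It expects exactly one
    // space between the colon and the opening quote of the value.
    static final Pattern TITLE = Pattern.compile("\"title\": \"([^\"]+)\"");

    // Returns every captured title in order of appearance (Matcher.results() needs Java 9+).
    static List<String> titles(String body) {
        return TITLE.matcher(body).results()
                .map(mr -> mr.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hand-written sample, not a real API response.
        String sample = "{ \"pageid\": 1, \"ns\": 0, \"title\": \"100 doors\" },\n"
                + "{ \"pageid\": 2, \"ns\": 0, \"title\": \"A+B\" }";
        System.out.println(titles(sample));
    }
}
```

Since the pattern depends on the exact spacing, a compact response of the form "title":"X" would yield no matches; a JSON parser would be a more robust choice where one is available.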


  
