java – Multithreaded page parsing


For an online game, I am writing a program that collects useful data. Two methods print the price of a given weapon on the market: one for a single country, the other for all 74 countries.

public void getCheapestWeapon(Weapon weapon) {
    //long result = 0;
    Map<String, Integer> countryAndId = allCountries.getCountryAndId();
    // Query the market of every country one after another
    for (Map.Entry<String, Integer> item : countryAndId.entrySet()) {
        //long start = System.currentTimeMillis();
        getCheapestWeapon(item.getKey(), weapon);
        //long finish = System.currentTimeMillis();
        //System.out.println(finish - start);
        //result += finish - start;
    }
}

public void getCheapestWeapon(String country, Weapon weapon) {
    long start = System.currentTimeMillis();
    String link = MARKET_LINK
            + allCountries.getCountryId(country)
            + "/" + weapon.getPlaceOnMarket()
            + "/" + weapon.getQuality()
            + "/" + ADDITIONAL_LINK;
    try {
        Document page = Jsoup.connect(link)
                .userAgent("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36")
                .get();
        Elements table = page.getElementsByClass("price_sorted");
        Element row = table.select("tr").first();
        Elements columns = row.getElementsByTag("td");
        Element priceCell = columns.get(3);
        String priceStr = priceCell.text();
        // Drop the trailing 4 characters after the number before parsing
        double price = Double.parseDouble(priceStr.substring(0, priceStr.length() - 4));
        System.out.println(country + " - " + price);
    } catch (IOException e) {
        e.printStackTrace();
    }
    //long finish = System.currentTimeMillis();
    //System.out.println(finish - start);
}

The problem is that fetching the data for all countries takes 16-18 seconds on average, which is a lot. I tried running getCheapestWeapon(item.getKey(), weapon) in a separate thread in the first method, but it did not help. How can I parse all 74 pages simultaneously?


In its current form (the results of getCheapestWeapon are simply printed to the console), the easiest way is to use the standard ExecutorService:

public void getCheapestWeapon(Weapon weapon) {
    final int THREADS = 4;
    ExecutorService pool = Executors.newFixedThreadPool( THREADS );

    Map<String, Integer> countryAndId = allCountries.getCountryAndId();
    for (Map.Entry<String, Integer> item : countryAndId.entrySet()) {
        // Each country is fetched and parsed as a separate task in the pool
        pool.execute( () -> getCheapestWeapon( item.getKey(), weapon ) );
    }

    // Accept no new tasks; the already submitted ones still run to completion
    pool.shutdown();
}


If the operation is not a one-off, move the creation and shutdown of the service out of the method. Because the main delay is most likely the network (with 74 sequential requests, even a 50 ms one-way trip between client and server adds up to roughly 7 seconds), the number of threads in the pool can exceed the number of cores; pick the value experimentally.
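As a rough sketch of the "service outside the method" idea, the pool can live in a field and be reused across calls, and invokeAll can return the prices instead of printing them. The class name PriceFetcher, the fetchPrice function and simulatedPrice are hypothetical stand-ins for the Jsoup request from the question, not part of the original code; the thread count of 16 is just a starting point to tune.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class PriceFetcher {
    // Created once and reused by every call to fetchAll
    private final ExecutorService pool;
    // Hypothetical stand-in for the network request: country -> price
    private final Function<String, Double> fetchPrice;

    PriceFetcher(int threads, Function<String, Double> fetchPrice) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.fetchPrice = fetchPrice;
    }

    /** Fetches the price for every country in parallel and waits for all results. */
    Map<String, Double> fetchAll(List<String> countries)
            throws InterruptedException, ExecutionException {
        List<Callable<Double>> tasks = new ArrayList<>();
        for (String country : countries) {
            tasks.add(() -> fetchPrice.apply(country));
        }
        // invokeAll blocks until every task is done; futures keep the order of tasks
        List<Future<Double>> futures = pool.invokeAll(tasks);
        Map<String, Double> prices = new LinkedHashMap<>();
        for (int i = 0; i < countries.size(); i++) {
            prices.put(countries.get(i), futures.get(i).get());
        }
        return prices;
    }

    /** Call once, when the application no longer needs the pool. */
    void shutdown() {
        pool.shutdown();
    }

    // Assumption for the demo: pretends each request takes 50 ms
    static Double simulatedPrice(String country) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (double) country.length();
    }

    public static void main(String[] args) throws Exception {
        // More threads than cores is fine here: tasks mostly wait on the network
        PriceFetcher fetcher = new PriceFetcher(16, PriceFetcher::simulatedPrice);
        try {
            Map<String, Double> prices =
                    fetcher.fetchAll(Arrays.asList("Poland", "USA", "Germany"));
            Map.Entry<String, Double> cheapest =
                    Collections.min(prices.entrySet(), Map.Entry.comparingByValue());
            System.out.println(cheapest.getKey() + " - " + cheapest.getValue());
        } finally {
            fetcher.shutdown();
        }
    }
}
```

In the real program, fetchPrice would wrap the Jsoup call and return the parsed price rather than printing it, so the caller can compare countries after all tasks finish.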

Alternatively, try getting more data from the server with fewer requests, for example by removing the country filter if you want the lowest price.
