Interface IHttpFetcher

    • Method Detail

      • getUserAgent

        String getUserAgent()
      • fetch

        IHttpFetchResponse fetch(CrawlDoc doc,
                                 HttpMethod httpMethod)
                          throws HttpFetchException

        Performs an HTTP request for the supplied document reference and HTTP method.

        For each supported HTTP method, implementors should do their best to populate the document and its CrawlDocInfo with as much information as they can.

        For unsupported HTTP methods, implementors should return an HTTP response with the CrawlState.UNSUPPORTED state. To spare users from having to configure multiple HTTP clients, implementors should try to support both the GET and HEAD methods. POST is only used in special cases and is often not used during a crawl session.

        A null method is treated as a GET.

        Parameters:
        doc - document to fetch or to use to make the request.
        httpMethod - HTTP method
        Returns:
        an HTTP response
        Throws:
        HttpFetchException - problem when fetching the document
        See Also:
        HttpFetchResponseBuilder.unsupported()
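
The method-handling contract described for fetch (a null method is treated as GET, GET and HEAD are supported, anything else yields an UNSUPPORTED response) could be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's implementation: the Method, State and Response types and the FETCHED state are hypothetical stand-ins for HttpMethod, CrawlState and IHttpFetchResponse, and no actual HTTP request is made.

```java
import java.util.EnumSet;
import java.util.Set;

public class FetcherSketch {

    // Hypothetical stand-in for the real HttpMethod type.
    enum Method { GET, HEAD, POST }

    // Hypothetical stand-in for CrawlState; FETCHED is a placeholder value.
    enum State { FETCHED, UNSUPPORTED }

    // Hypothetical, simplified response carrying only a state and the
    // method that was effectively used.
    static final class Response {
        final State state;
        final Method method;

        Response(State state, Method method) {
            this.state = state;
            this.method = method;
        }
    }

    // Methods this sketch claims to support.
    private static final Set<Method> SUPPORTED =
            EnumSet.of(Method.GET, Method.HEAD);

    static Response fetch(String reference, Method method) {
        // A null method is treated as a GET.
        Method effective = (method == null) ? Method.GET : method;

        // Unsupported methods return a response flagged as UNSUPPORTED
        // rather than throwing an exception.
        if (!SUPPORTED.contains(effective)) {
            return new Response(State.UNSUPPORTED, effective);
        }

        // A real implementation would perform the HTTP request for the
        // reference here and populate the document and its metadata.
        return new Response(State.FETCHED, effective);
    }

    public static void main(String[] args) {
        String ref = "https://example.com/";
        Response viaNull = fetch(ref, null);
        Response viaPost = fetch(ref, Method.POST);
        System.out.println(viaNull.method + " -> " + viaNull.state);
        System.out.println(viaPost.method + " -> " + viaPost.state);
    }
}
```

Returning an UNSUPPORTED response instead of throwing lets a caller fall through to another fetcher, which is presumably why the documentation reserves HttpFetchException for actual fetch problems.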