public final class ApacheHttpUtil extends Object
Modifier and Type | Method and Description |
---|---|
static void | applyContentTypeAndCharset(String value, CrawlDocInfo docInfo): Applies the Content-Type HTTP response header on the supplied document info. |
static boolean | applyResponseContent(org.apache.http.HttpResponse response, CrawlDoc doc): Applies the HTTP response content to a document if such content exists. |
static void | applyResponseHeaders(org.apache.http.HttpResponse response, String prefix, CrawlDoc doc): Applies the HTTP response headers to a document. |
static void | authenticateUsingForm(org.apache.http.client.HttpClient httpClient, HttpAuthConfig authConfig) |
static org.apache.http.client.methods.HttpRequestBase | createUriRequest(String url, HttpMethod method): Creates an HTTP request. |
static org.apache.http.client.methods.HttpRequestBase | createUriRequest(String url, String method): Creates an HTTP request. |
static void | setRequestIfModifiedSince(org.apache.http.HttpRequest request, CrawlDoc doc): Sets the If-Modified-Since HTTP request header based on document cached last crawled date (if any). |
static void | setRequestIfNoneMatch(org.apache.http.HttpRequest request, CrawlDoc doc): Sets the ETag If-None-Match HTTP request header based on document cached ETag value (if any). |
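
The two conditional request headers named at the end of the table can be illustrated with the JDK alone. The sketch below is not the library's implementation: the class `ConditionalHeadersSketch`, the method `conditionalHeaders`, and the idea of passing the cached values as an `Instant` and a `String` are assumptions made for illustration. It only shows how a cached crawl date and a cached ETag might translate into `If-Modified-Since` and `If-None-Match` values.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ConditionalHeadersSketch {

    // HTTP dates use a fixed-width, English, GMT-based format.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    // Hypothetical helper: builds conditional headers from cached values.
    // A null argument means "nothing cached", so no header is produced.
    static Map<String, String> conditionalHeaders(Instant lastCrawled, String etag) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (lastCrawled != null) {
            headers.put("If-Modified-Since", HTTP_DATE.format(lastCrawled));
        }
        if (etag != null) {
            // The cached ETag is sent back unchanged, quotes included.
            headers.put("If-None-Match", etag);
        }
        return headers;
    }
}
```

A server that honors these headers can answer with 304 Not Modified instead of sending the body again, which is why both values are only set when a cached value exists.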
public static boolean applyResponseContent(org.apache.http.HttpResponse response, CrawlDoc doc) throws IOException
Applies the HTTP response content to a document if such content exists. The stream is fully downloaded and associated with a document.
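
As a rough illustration of "fully downloaded", the following sketch reads an entire stream into memory and reports whether there was anything to read from. It uses only the JDK; the `Doc` class and the `applyContent` method are hypothetical stand-ins, not the CrawlDoc API, and the real method may well buffer content differently.

```java
import java.io.IOException;
import java.io.InputStream;

final class ResponseContentSketch {

    // Hypothetical stand-in for a document that holds downloaded bytes.
    static final class Doc {
        byte[] content;
    }

    // Returns true if a body stream was present and has been read in full.
    static boolean applyContent(InputStream body, Doc doc) throws IOException {
        if (body == null) {
            return false; // no content to apply
        }
        try (InputStream in = body) {
            doc.content = in.readAllBytes(); // requires Java 9 or later
        }
        return true;
    }
}
```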
Parameters:
response - the HTTP response
doc - document to apply content on
Returns:
true if there was content to apply
Throws:
IOException - could not read existing content
public static void applyResponseHeaders(org.apache.http.HttpResponse response, String prefix, CrawlDoc doc)
Applies the HTTP response headers to a document. This method does its best to derive relevant information from the HTTP headers that can be set on the document HttpDocInfo.
In addition, all HTTP headers are added to the document metadata, with an optional prefix.
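
The prefixing behavior can be pictured with a small JDK-only sketch. Everything named here is an assumption for illustration: the class `HeaderMetadataSketch`, the method `toMetadata`, the representation of headers as name/value pairs, and the `http-` prefix in the usage note. It is not how the library stores metadata internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HeaderMetadataSketch {

    // Hypothetical helper: copies headers ({name, value} pairs) into a
    // multi-valued metadata map, prepending the optional prefix to each name.
    static Map<String, List<String>> toMetadata(List<String[]> headers, String prefix) {
        String p = (prefix == null) ? "" : prefix;
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        for (String[] header : headers) {
            // A header may repeat (for example Set-Cookie), so values accumulate.
            metadata.computeIfAbsent(p + header[0], k -> new ArrayList<>())
                    .add(header[1]);
        }
        return metadata;
    }
}
```

With a prefix such as `http-`, a `Content-Type` header would end up under the key `http-Content-Type`; with a null prefix the header name is used unchanged.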
Parameters:
response - the HTTP response
prefix - optional metadata prefix for all HTTP response headers
doc - document to apply headers on
public static void applyContentTypeAndCharset(String value, CrawlDocInfo docInfo)
Applies the Content-Type HTTP response header on the supplied document info. It does so by extracting both the content type and charset from the value, and setting them by invoking DocInfo.setContentType(ContentType) and DocInfo.setContentEncoding(String).
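
To make the extraction step concrete, here is a minimal JDK-only sketch of splitting a Content-Type value into its media type and charset. The class `ContentTypeSketch` and the method `parse` are hypothetical, and the parsing rules shown (lowercasing the media type, stripping quotes from the charset) are assumptions; the library's own parsing may differ.

```java
import java.util.Locale;

final class ContentTypeSketch {

    // Hypothetical helper: returns {contentType, charset}; either may be null.
    static String[] parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return new String[] {null, null};
        }
        // The media type comes first; parameters follow, separated by ';'.
        String[] parts = value.split(";", -1);
        String contentType = parts[0].trim().toLowerCase(Locale.ROOT);
        if (contentType.isEmpty()) {
            contentType = null;
        }
        String charset = null;
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                // Charset values may be quoted, e.g. charset="utf-8".
                charset = param.substring(eq + 1).trim().replace("\"", "");
                if (charset.isEmpty()) {
                    charset = null;
                }
                break;
            }
        }
        return new String[] {contentType, charset};
    }
}
```

For a value like `text/html; charset=UTF-8`, the first element would be passed to something like setContentType and the second to setContentEncoding.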
This method is automatically invoked by applyResponseHeaders(HttpResponse, String, CrawlDoc) when encountering a content type header.
Parameters:
value - value to parse and set
docInfo - document info
public static void setRequestIfModifiedSince(org.apache.http.HttpRequest request, CrawlDoc doc)
Sets the If-Modified-Since HTTP request header based on the document's cached last crawled date (if any).
Parameters:
request - HTTP request
doc - document
public static void setRequestIfNoneMatch(org.apache.http.HttpRequest request, CrawlDoc doc)
Sets the ETag If-None-Match HTTP request header based on the document's cached ETag value (if any).
Parameters:
request - HTTP request
doc - document
public static org.apache.http.client.methods.HttpRequestBase createUriRequest(String url, String method)
Creates an HTTP request.
Parameters:
url - the request target URL
method - HTTP method (defaults to GET if null)
public static org.apache.http.client.methods.HttpRequestBase createUriRequest(String url, HttpMethod method)
Creates an HTTP request.
Parameters:
url - the request target URL
method - HTTP method (defaults to GET if null)
public static void authenticateUsingForm(org.apache.http.client.HttpClient httpClient, HttpAuthConfig authConfig) throws IOException, URISyntaxException
Throws:
IOException
URISyntaxException
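
The "defaults to GET if null" rule documented for createUriRequest can be sketched without Apache HttpClient. The example below swaps in the JDK's own `java.net.http.HttpRequest` (Java 11 or later) purely for illustration; the class `UriRequestSketch`, the method `createRequest`, and the uppercasing of the method name are assumptions, and the real method returns an `HttpRequestBase` instead.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Locale;

final class UriRequestSketch {

    // Hypothetical helper: builds a body-less request, falling back to GET
    // when no method name is supplied.
    static HttpRequest createRequest(String url, String method) {
        String name = (method == null) ? "GET" : method.toUpperCase(Locale.ROOT);
        return HttpRequest.newBuilder(URI.create(url))
                .method(name, HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

Building the request does not open a connection, so the default can be checked by inspecting the returned object's `method()` value.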
Copyright © 2009–2023 Norconex Inc. All rights reserved.