java.lang.Object
- com.norconex.collector.http.link.AbstractLinkExtractor
  - com.norconex.collector.http.link.AbstractTextLinkExtractor
    - com.norconex.collector.http.link.impl.XMLFeedLinkExtractor

All Implemented Interfaces:

ILinkExtractor, IXMLConfigurable
```
public class XMLFeedLinkExtractor
extends AbstractTextLinkExtractor
```
Link extractor for extracting links out of RSS and Atom XML feeds. It extracts the content of <link> tags. If you need more complex extraction, consider using RegexLinkExtractor or creating your own ILinkExtractor implementation.
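As an illustration of the idea only, the following sketch pulls URLs out of `<link>` elements using the JDK's StAX parser. It is not the implementation used by this class. The class name `FeedLinkSketch` and the method `extractLinks` are hypothetical, and reading the `href` attribute for Atom-style links is an assumption that goes beyond what is stated above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the Norconex API.
class FeedLinkSketch {

    // Collects URLs from every <link> element in the feed.
    // Assumption: RSS carries the URL as element text, Atom as an href attribute.
    static List<String> extractLinks(String feedXml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Feeds come from untrusted sources, so disable DTDs and external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        List<String> links = new ArrayList<>();
        try {
            XMLStreamReader xml =
                    factory.createXMLStreamReader(new StringReader(feedXml));
            try {
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT
                            && "link".equals(xml.getLocalName())) {
                        String href = xml.getAttributeValue(null, "href");
                        // Fall back to the element text when there is no href.
                        String url = (href != null ? href : xml.getElementText()).trim();
                        if (!url.isEmpty()) {
                            links.add(url);
                        }
                    }
                }
            } finally {
                xml.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed feed XML", e);
        }
        return links;
    }
}
```

Calling `FeedLinkSketch.extractLinks(feedXml)` on a feed string returns the link URLs in document order. A streaming parser is used here so that large feeds need not be loaded into a DOM tree.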

Applicable documents

By default, this extractor is applied only to documents matching one of its supported default content types.

Referrer data

The following referrer information is stored as metadata in each document represented by the extracted URLs:
- Referrer reference: The reference (URL) of the page where the link to a document was found. Metadata value is HttpDocMetadata.REFERRER_REFERENCE.
XML configuration usage:
```
<extractor
    class="com.norconex.collector.http.link.impl.XMLFeedLinkExtractor">
  <fieldMatcher>
    (optional expression for fields used for links extraction instead
     of the document stream)
  </fieldMatcher>
</extractor>
```
XML usage example:
```
<extractor
    class="com.norconex.collector.http.link.impl.XMLFeedLinkExtractor">
  <restrictTo
      field="document.reference"
      method="regex">
    .*rss$
  </restrictTo>
</extractor>
```
The above example specifies that this extractor applies only to documents whose URL ends with "rss" (in addition to matching the supported default content types).
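To see which references the expression `.*rss$` accepts, the sketch below tests it with `java.util.regex`. It assumes the expression must match the whole reference and is case-sensitive; neither is confirmed here. `RestrictToSketch` and `matchesRestriction` are hypothetical names, not part of the collector.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Norconex API.
class RestrictToSketch {

    // Same expression as in the XML usage example.
    private static final Pattern RSS_SUFFIX = Pattern.compile(".*rss$");

    // Returns true when the entire reference matches the expression.
    static boolean matchesRestriction(String reference) {
        return RSS_SUFFIX.matcher(reference).matches();
    }
}
```

Under these assumptions, `http://example.com/feed.rss` is accepted, while `http://example.com/rss/page.html` and `http://example.com/feed.RSS` are not.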
Since:

2.7.0

Author:

Pascal Essiembre

Constructor Summary

Constructor	Description
`XMLFeedLinkExtractor()`

Method Summary

Modifier and Type	Method	Description
`boolean`	`equals(Object other)`
`void`	`extractTextLinks(Set<Link> links, HandlerDoc doc, Reader reader)`
`int`	`hashCode()`
`protected void`	`loadTextLinkExtractorFromXML(XML xml)`	Loads configuration settings specific to the implementing class.
`protected void`	`saveTextLinkExtractorToXML(XML xml)`	Saves configuration settings specific to the implementing class.
`String`	`toString()`

Methods inherited from class com.norconex.collector.http.link.AbstractTextLinkExtractor
extractLinks, getFieldMatcher, loadLinkExtractorFromXML, saveLinkExtractorToXML, setFieldMatcher

Methods inherited from class com.norconex.collector.http.link.AbstractLinkExtractor
addRestriction, addRestrictions, clearRestrictions, extractLinks, getRestrictions, loadFromXML, removeRestriction, removeRestriction, saveToXML, setRestrictions

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - XMLFeedLinkExtractor
```
public XMLFeedLinkExtractor()
```
- Method Detail
  - extractTextLinks
```
public void extractTextLinks(Set<Link> links,
                             HandlerDoc doc,
                             Reader reader)
                      throws IOException
```
    Specified by:
    
    extractTextLinks in class AbstractTextLinkExtractor
    
    Throws:
    
    IOException
  - loadTextLinkExtractorFromXML
```
protected void loadTextLinkExtractorFromXML(XML xml)
```
    Description copied from class: AbstractTextLinkExtractor
    
    Loads configuration settings specific to the implementing class.
    
    Specified by:
    
    loadTextLinkExtractorFromXML in class AbstractTextLinkExtractor
    
    Parameters:
    
    xml - XML configuration
  - saveTextLinkExtractorToXML
```
protected void saveTextLinkExtractorToXML(XML xml)
```
    Description copied from class: AbstractTextLinkExtractor
    
    Saves configuration settings specific to the implementing class.
    
    Specified by:
    
    saveTextLinkExtractorToXML in class AbstractTextLinkExtractor
    
    Parameters:
    
    xml - the XML
  - equals
```
public boolean equals(Object other)
```
    Overrides:
    
    equals in class AbstractTextLinkExtractor
  - hashCode
```
public int hashCode()
```
    Overrides:
    
    hashCode in class AbstractTextLinkExtractor
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class AbstractTextLinkExtractor
