
Use the GetHDFS processor and set the Keep Source File property as needed (the default is false). Change the property to true if you want to keep the source file in the directory, or leave it as false if you want the file to be deleted after it is fetched.

In File Filter Regex, give the regex that matches your required filenames.
Ex: I need only files starting with 2011, so I have given the regex as 2011.*

This processor now fetches only the /user/yashu/folder2/.1 file from the directory.

(or) Configure your directory path in the ListHDFS processor and this processor will list all the files that are in the directory. We cannot filter out the files that we require in the ListHDFS processor, but every flowfile from the ListHDFS processor will have a filename attribute associated with it.
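To see which names a filter such as 2011.* would keep, here is a minimal Python sketch. It is only an illustration: NiFi evaluates Java regular expressions, and the sketch assumes the filter has to match the whole filename. The listing and the helper name select_files are made up for the example.

```python
import re

# Same pattern as in the example: filenames that start with 2011.
FILE_FILTER = re.compile(r"2011.*")


def select_files(filenames):
    """Return only the names that the pattern matches in full."""
    # Assumption: the filter must match the whole filename, not just a part of it.
    return [name for name in filenames if FILE_FILTER.fullmatch(name)]


# Hypothetical directory listing, for illustration only.
listing = ["2011_sales.csv", "2012_sales.csv", "20110101.log", "report_2011.txt"]
selected = select_files(listing)
print(selected)
```

Under that whole-name assumption, report_2011.txt is not selected, because 2011 does not appear at the start of the name.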

The Apache NiFi framework includes built-in processors that can retrieve data (processors that are supplied with the Apache NiFi framework, rather than with IDOL NiFi Ingest). For example, there is a GetFile processor that can retrieve files from a directory. You can use the Apache NiFi processors to retrieve data, but before routing the FlowFiles to an IDOL NiFi Ingest processor, you must ensure that several attributes are set.

An IDOL NiFi Ingest processor expects the following attributes to be set on each FlowFile.

idol.reference - a document reference. Micro Focus recommends that you create unique references, so that IDOL Content can de-duplicate documents based on the document reference.

An operation attribute - the indexing operation to perform with the data, for example Add, Update, or Remove.

The following example shows a dataflow with a GetFile processor to retrieve files, followed by an UpdateAttribute processor to set the required attributes on each FlowFile. The output from the UpdateAttribute processor is suitable to be routed to IDOL NiFi Ingest processors, in this example a KeyViewExtractFiles processor.

The following image shows part of the configuration for the UpdateAttribute processor. This example uses FlowFile attributes ("path" and "filename") that have been set by the GetFile processor to populate the IDOL document reference. The attributes that you can use to populate the reference will vary depending on which processor you use to retrieve data. In most cases, the operation attribute can be set to Add, so that data is added to the IDOL index. If a file is retrieved again and a new document is indexed, IDOL can replace the old data because the new document will have the same reference. The GetFile processor creates FlowFiles where the body of the FlowFile contains the binary content from a file, so the idol.type attribute is set to contentfile. For more information about these attributes, see Introduction to FlowFiles and Documents.

To add attributes to FlowFiles using the UpdateAttribute processor

1. Right-click the UpdateAttribute processor and click Configure. The Configure Processor dialog box opens.
2. In the Property Name box, type the name of the attribute that you want to add to your FlowFiles, for example idol.type, and then click OK.
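The effect of setting these attributes can be sketched in Python. This is not how NiFi or IDOL NiFi Ingest implement it; it only models a FlowFile's attributes as a dictionary. The function name add_idol_attributes, the sample values, and the choice to build the reference by joining "path" and "filename" are assumptions made for illustration. The operation attribute is left out of the sketch.

```python
def add_idol_attributes(attributes):
    """Return a copy of the FlowFile attributes with idol.* values added."""
    updated = dict(attributes)
    # Assumption: the reference is "path" followed by "filename", so the
    # same file always produces the same reference when it is retrieved again.
    updated["idol.reference"] = attributes["path"] + attributes["filename"]
    # The FlowFile body holds the binary content of a file.
    updated["idol.type"] = "contentfile"
    return updated


# Hypothetical attributes, as a file-retrieving processor might set them.
flowfile = {"path": "reports/", "filename": "2011_sales.csv"}
result = add_idol_attributes(flowfile)
print(result["idol.reference"])
```

Because the reference depends only on where the file is and what it is called, retrieving the same file twice gives the same reference both times.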
