no copyright infringement is intended
In this example I build off of the previous example, by adding an Oracle CLOB field.
All in all, this is pretty simple and requires just a few updates:
- Update the Oracle table to have a CLOB column
- Update the collections data config file to use the ClobTransformer
- Update the managed-schema to use a TextField data type
Update the Oracle table to have a CLOB column
CREATE TABLE "[schema]"."SOLR_TEST"
(
"ID" NUMBER generated always AS identity,
"TIME_STAMP" timestamp (6) DEFAULT sys_extract_utc(systimestamp),
"CATEGORY" VARCHAR2(255 byte),
"TYPE" VARCHAR2(255 byte),
"SERVERNAME" VARCHAR2(255 byte),
"CODE" VARCHAR2(255 byte),
"MSG" CLOB
);
(
"ID" NUMBER generated always AS identity,
"TIME_STAMP" timestamp (6) DEFAULT sys_extract_utc(systimestamp),
"CATEGORY" VARCHAR2(255 byte),
"TYPE" VARCHAR2(255 byte),
"SERVERNAME" VARCHAR2(255 byte),
"CODE" VARCHAR2(255 byte),
"MSG" CLOB
);
Update the collections data config file to use the ClobTransformer
<dataconfig>
<datasource driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:...[put your connections string here] />
<document>
<entity name ="SOLR_TEST" transformer ="ClobTransformer" "query="SELECT ID, to_char(TIME_STAMP,'YYYY-MM-DD"T"HH24:MI:SS"Z"') AS TIME_STAMP, CATEGORY, TYPE, SERVERNAME, CODE, MSG FROM SOLR_TEST" deltaImportQuery="SELECT ID, to_char(TIME_STAMP,'YYYY-MM-DD"T"HH24:MI:SS"Z"') AS TIME_STAMP, CATEGORY, TYPE, SERVERNAME, CODE, MSG FROM SOLR_TEST WHERE ID = ${dataimporter.delta.ID}" deltaQuery="SELECT ID
<datasource driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:...[put your connections string here] />
<document>
<entity name =
<field column = "TIME_STAMP
<field column = "
<field column = "
<field column = "SERVERNAME
<field column = "CODE
<field column = "MSG
</dataconfig>
Update the managed-schema to use a TextField data type
...
<field name="msg" type="text_en" indexed="true" stored="true" multiValued="false" />
...
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory" /> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" /> <filter class="solr.LowerCaseFilterFactory" /> <filter class="solr.EnglishPossessiveFilterFactory" /> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" /> <filter class="solr.PorterStemFilterFactory" /> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory" /> <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" /> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" /> <filter class="solr.LowerCaseFilterFactory" /> <filter class="solr.EnglishPossessiveFilterFactory" /> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" /> <filter class="solr.PorterStemFilterFactory" /> </analyzer> </fieldType>
...
Full disclosure, I built off of the "example-data-driven-schema" and the field type comes directly from there.
That's it! You can now run a full import, make some additions, and run a delta. The most difficult part of this process, for me, was getting 4000+ characters inserted into the clob field in the Oracle table. Here's what that looks like:
DECLARE
my_clob CLOB;
BEGIN
my_clob := '4000 plus characters...';
INSERT INTO [SCHEMA].solr_test
(
category,
TYPE,
servername,
code,
msg
)
VALUES
(
'Manual',
'SQL Insert',
'hostname',
'BEA-100000',
my_clob
);
END;
Great, this worked for me. One note: It's important to use column="COLUMN" (upper case) - otherwise it does not work. Cheers!
ReplyDelete