tag:blogger.com,1999:blog-68182579840189164192024-03-19T04:08:56.191-07:00Jon Morisi's SQL BlogJon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.comBlogger46125tag:blogger.com,1999:blog-6818257984018916419.post-80942797381194078742022-06-23T15:08:00.000-07:002022-06-23T15:08:15.012-07:00SSIS Failed to retrieve data for this request The RPC Server is unavailable (HRESULT 0X800706BA)<p> </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEi3_uloaVFoMy0E5dMICG40tHvyw9a7l-EhqAf8hTuCWK_wnvZ6Emv3bO0n4_ND-fCkH0uLX0uSbaxigtJvbOx7mT9Fuayl2F3ihOikLM5oZz8UH2hw9wpzjR2WzS-IIAlumFgaDzfR7QCB26RhAou1c8ojHWU6LDRQioM3hokiuBOmekY7wtiReLh7" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="219" data-original-width="600" height="117" src="https://blogger.googleusercontent.com/img/a/AVvXsEi3_uloaVFoMy0E5dMICG40tHvyw9a7l-EhqAf8hTuCWK_wnvZ6Emv3bO0n4_ND-fCkH0uLX0uSbaxigtJvbOx7mT9Fuayl2F3ihOikLM5oZz8UH2hw9wpzjR2WzS-IIAlumFgaDzfR7QCB26RhAou1c8ojHWU6LDRQioM3hokiuBOmekY7wtiReLh7" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">I just spent a long slog sorting out why I could not connect to my SSIS instance remotely. I work in a very secure environment requiring network approval for any and all ports. 
According to the following article, I was under the impression that a request to open incoming traffic on port 135, to a specific IP, would allow SQL Server Management Studio, on that specific IP, to connect remotely to SSIS:</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><a href="https://docs.microsoft.com/en-us/sql/sql-server/install/configure-the-windows-firewall-to-allow-sql-server-access?redirectedfrom=MSDN&view=sql-server-ver16#BKMK_ssis">https://docs.microsoft.com/en-us/sql/sql-server/install/configure-the-windows-firewall-to-allow-sql-server-access?redirectedfrom=MSDN&view=sql-server-ver16#BKMK_ssis</a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">After opening port 135, I was receiving the error message in the title of this article:</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgvKUzdPNb3FnsbO39oPV3yolOKCP0rwWyl_lmABVY_VUX8qZhTxQIgyGJWRXtjBz_uQx3Ola7WKna4ox7suZG2xn1-HTankPy3eoUpxFPGKxgd7odiTMXoGNanqffRgs3bm5n7078_Z2RJ4_iUZ_iLbpod__gIB4AcMGWnBxM7IcvDW_-b5Qrbb7D4" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="244" data-original-width="604" src="https://blogger.googleusercontent.com/img/a/AVvXsEgvKUzdPNb3FnsbO39oPV3yolOKCP0rwWyl_lmABVY_VUX8qZhTxQIgyGJWRXtjBz_uQx3Ola7WKna4ox7suZG2xn1-HTankPy3eoUpxFPGKxgd7odiTMXoGNanqffRgs3bm5n7078_Z2RJ4_iUZ_iLbpod__gIB4AcMGWnBxM7IcvDW_-b5Qrbb7D4=s16000" /></a></div><br /><div style="text-align: left;"><br /></div></div></div>I can only assume you've reached my article here because you too have run into this particular error message. 
A lot of searching led me to many suggestions of what could be wrong, including:<ul><li>There are no named instances for SSIS, check your Server Name</li><li>You may need to open port 135</li><li>Maybe it's insufficient SSIS server permissions</li><li>Is the RPC service started?</li><li>It could be a double hop authentication issue</li><li>Is your DCOM Config for Microsoft SQL Server Integration Services configured correctly?</li><li>Configure your firewall to allow MsDtsSrvr.exe</li><li>Try using the IP, maybe it's netbios</li></ul><p>As you can imagine, none of these was the solution to my problem. The root cause is that SSIS uses RPC, which relies on a large dynamic port range, so opening port 135 alone was not enough. This article describes what's going on with RPC, and has commands for identifying the existing RPC dynamic port range and for setting the range:</p><p><a href="https://docs.microsoft.com/en-us/troubleshoot/windows-server/networking/default-dynamic-port-range-tcpip-chang">https://docs.microsoft.com/en-us/troubleshoot/windows-server/networking/default-dynamic-port-range-tcpip-chang</a></p><p>For example, this command will show you the existing port range:</p><p><span style="font-family: courier;">netsh int ipv4 show dynamicport tcp</span></p><p><br /></p><p>For what it's worth, here are a few articles I waded through before I found the solution:</p><p><a href="https://stackoverflow.com/questions/48241221/ssis-the-rpc-server-is-unavailable">https://stackoverflow.com/questions/48241221/ssis-the-rpc-server-is-unavailable</a></p><p><a href="https://social.msdn.microsoft.com/Forums/sqlserver/en-US/e1e7e568-9116-43de-beb1-6cf3a684582e/the-rpc-server-is-unavailable-tried-almost-everything?forum=sqlintegrationservices">https://social.msdn.microsoft.com/Forums/sqlserver/en-US/e1e7e568-9116-43de-beb1-6cf3a684582e/the-rpc-server-is-unavailable-tried-almost-everything?forum=sqlintegrationservices</a></p><p><a 
href="https://itluke.online/2017/11/08/solved-exception-from-hresult-0x800706ba/">https://itluke.online/2017/11/08/solved-exception-from-hresult-0x800706ba/</a></p><p><a href="https://www.mssqltips.com/sqlservertip/6236/connecting-to-integration-services-access-is-denied-in-sql-server-2016-or-2017/">https://www.mssqltips.com/sqlservertip/6236/connecting-to-integration-services-access-is-denied-in-sql-server-2016-or-2017/</a></p><p><br /></p>Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-33376186960707069772021-06-07T11:39:00.004-07:002021-06-07T11:42:03.701-07:00Ranger - Hive policy activation time delayed by more than ...<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYleFqsMkuuWflIxNLC9Jjb1Vw1r-OdyebzdCgE6nqMIu491BCoUDt52SXgnmnQ4VT9DojmsTxe59qmPZc878Os-GZF2o3u6C84AnSfx2ga7eJF3lenT2U48f6LIukP_x2saOTuxNkWnQ/s641/ranger.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="136" data-original-width="641" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYleFqsMkuuWflIxNLC9Jjb1Vw1r-OdyebzdCgE6nqMIu491BCoUDt52SXgnmnQ4VT9DojmsTxe59qmPZc878Os-GZF2o3u6C84AnSfx2ga7eJF3lenT2U48f6LIukP_x2saOTuxNkWnQ/s320/ranger.jpg" width="320" /></a></div><p>Just a quick blog here about an issue I had with HDP-3.1.4.0. I was recently setting up a new user with specific rights in Ranger for Hive access. After creating the new policy and attempting to validate it, I received an error message stating that the hive user does not have use privilege. This error was produced even though I had just created the policy specifically granting those privileges.</p><p>Upon further review I noticed that the plugin was downloading the policy, but not applying it. 
You would find this information here in Ranger:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcIJJVfaohJlDu4tQ2PpvZHXN4hrsDBzBQq0Qwb3RxjmeGwQ0-X054-qQMinZxlP3TDFljtcov_-R2N2oAfRwZ3Ue1UG5SUQlGuSmvRyv50scbppr6zzikYShzGxPRp0i8CUXNbH_2bLg/" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="123" data-original-width="829" height="94" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcIJJVfaohJlDu4tQ2PpvZHXN4hrsDBzBQq0Qwb3RxjmeGwQ0-X054-qQMinZxlP3TDFljtcov_-R2N2oAfRwZ3Ue1UG5SUQlGuSmvRyv50scbppr6zzikYShzGxPRp0i8CUXNbH_2bLg/w640-h94/image.png" width="640" /></a></div><br />After much searching around, I could not find a specific article detailing this issue and the resolution.<p></p><p>What I did find was this article:</p><p><a href="https://issues.apache.org/jira/browse/RANGER-2348">https://issues.apache.org/jira/browse/RANGER-2348</a></p><p>Although I am not running interactive mode, comments in the link above did point me to what the ultimate issue was; "multiple versions of jersey libs inside the classpath."</p><p>What I found was that the ../hive/lib/* directories contained both of these:<br />jersey-client-1.19.jar<br />jersey-client-2.25.1.jar</p><p>The final resolution was to remove jersey-client-1.19.jar, which resolved the classpath conflict and allowed the Ranger policies to be applied without failing.</p><p>This seems to be a generic issue with Ranger/Hive in HDP 3.1.4, perhaps other versions as well.</p><p>I hope this helps someone googling around for:<br />hive user does not have use privilege<br />or, ranger hive "policy activation time delayed by more than"</p>Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-67719552022086543942021-05-03T08:09:00.001-07:002021-05-03T08:25:28.841-07:00Python - pyodbc and Batch Inserts to SQL Server (or pyodbc 
fast_executemany, not so fast...)<p> </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhknBw8SDJCxQU0l6K3865YlGj047Hy9w5TB0BYuCzc9EL4g4sdoHlBuWszVHZ2QuCXxd034ibJTyDd_aZ7LT1vcbuq-Bkn6NqUVIZIuW8K_xZ8fPHy9lbShH3i_-13NAx9DCAlVuGMUMQ/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="203" data-original-width="601" height="108" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhknBw8SDJCxQU0l6K3865YlGj047Hy9w5TB0BYuCzc9EL4g4sdoHlBuWszVHZ2QuCXxd034ibJTyDd_aZ7LT1vcbuq-Bkn6NqUVIZIuW8K_xZ8fPHy9lbShH3i_-13NAx9DCAlVuGMUMQ/" width="320" /></a></div>I recently had a project in which I needed to transfer a 60 GB SQLite database to SQL Server. After some research I found the sqlite3 and pyodbc modules, and set about scripting connections and insert statements. <p></p><p>The basic form of my script is to import the modules, setup the database connections, and iterate (via cursor) over the rows of the select statement creating insert statements and executing them. </p><p>The issue here is that this method results in single inserts being sent one at a time yielding less than satisfactory performance. 
Inserting 35m+ rows in this fashion takes ~5hrs on my system.</p><p>After some researching I found the general community suggesting solutions like the following:</p><p></p><ul style="text-align: left;"><li>executemany</li><li>fast_executemany<br /><a href="https://towardsdatascience.com/how-i-made-inserts-into-sql-server-100x-faster-with-pyodbc-5a0b5afdba5"><span style="font-size: x-small;">https://towardsdatascience.com/how-i-made-inserts-into-sql-server-100x-faster-with-pyodbc-5a0b5afdba5</span></a></li><li>dumping data to csv files and using bulk insert<br /><a href="https://github.com/mkleehammer/pyodbc/issues/619"><span style="font-size: x-small;">https://github.com/mkleehammer/pyodbc/issues/619</span></a></li><li>using pandas and sqlalchemy<br /><a href="https://github.com/mkleehammer/pyodbc/issues/812"><span style="font-size: x-small;">https://github.com/mkleehammer/pyodbc/issues/812</span></a></li></ul><p></p><p>In addition, based on my prior DBA experience, my initial thought was to create a string with BEGIN TRANSACTION, concatenate a batch of INSERT statements, end the string with a COMMIT TRANSACTION, and finally pass the batches of transactions as strings to pyodbc.execute.</p><p>I did some testing, please find my results below.</p><p>For this experiment I used a production dataset consisting of a single SQLite table with 18,050,355 rows. The table has four columns with datatypes of integer, integer, text, and real. 
The real column was converted to an integer as it was improperly defined in the SQLite database and was validated to be an integer.</p><h3 style="text-align: left;">1 by 1 inserts (~80 minutes)</h3><p>The 1 by 1 inserts were executed by performing inserts as such:</p><p><span style="font-family: courier;">select_stmt="SELECT col0, col1,col2,CAST(col3 AS INT) FROM table"<br />for row in sqlite3Cursor.execute(select_stmt):<br /> insert_stmt = f"INSERT INTO table(col0,col1,col2,col3) VALUES ({row[0]},{row[1]},'{row[2]}',{row[3]})"<br /> pyodbcCursor.execute(insert_stmt)</span></p><p>Using SQL Server Profiler I could see the 1 by 1 inserts happening:</p><p></p><div class="separator" style="clear: both; text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDoOHI6I0nRcZUiOPqHOuAlb9xnj0-cajd-MUzeWVbzDg6Z5pe7XoME9h9YH9zfGc3l6kzE_ElCBrB6YClc80EBIk9SubwunL6-vTE_pqTlMWm_f1A8c2sdygxxL7N93P9SpWMt8OYDnk/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="185" data-original-width="406" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDoOHI6I0nRcZUiOPqHOuAlb9xnj0-cajd-MUzeWVbzDg6Z5pe7XoME9h9YH9zfGc3l6kzE_ElCBrB6YClc80EBIk9SubwunL6-vTE_pqTlMWm_f1A8c2sdygxxL7N93P9SpWMt8OYDnk/s16000/image.png" /></a></div><h3>executemany (~95 minutes)</h3><div>For this test I modified the insert statement into a prepared statement and fed in the insert statement and the select from SQLite as the input parameters for the executemany function:</div><p></p><div><span style="font-family: courier;">insert_stmt = f"INSERT INTO table(</span><span style="font-family: courier;">col0,col1,col2,col3</span><span style="font-family: courier;">) VALUES (?,?,?,?)"</span></div><div><span style="font-family: courier;">pyodbcCursor.executemany(insert_stmt,sqlite3Cursor.execute(select_stmt))</span></div><div><span style="font-family: courier; font-size: x-small;"><br /></span></div><div>Via profiler we can see that we're now running prepared 
statements:</div><div><div class="separator" style="clear: both; text-align: left;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGFPFmwG7a_iru4nIkL80E5WEOQ5-oX6WVvmYVlYVLfZhZWGP4Wlk6lBGW6T1aBFc6TGmf-sLdfA1boind2_iILSD4zQYPkf5HR8RXc2AxcS5AAE9r_qXmxbVLtePy21d4xOz-P8TwFcs/" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="165" data-original-width="631" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGFPFmwG7a_iru4nIkL80E5WEOQ5-oX6WVvmYVlYVLfZhZWGP4Wlk6lBGW6T1aBFc6TGmf-sLdfA1boind2_iILSD4zQYPkf5HR8RXc2AxcS5AAE9r_qXmxbVLtePy21d4xOz-P8TwFcs/s16000/image.png" /></a></div><br />Unfortunately, this took longer than the 1 by 1 inserts.</div><p></p><h3>fast_executemany (~75 minutes)</h3>For this test I simply added a line of code:<p></p><p><span style="font-family: courier;">pyodbcCursor.fast_executemany = True</span></p><p>On the profiler side of things this looked exactly like executemany, with repetitive executions of single prepared statements.</p><p>This did run quite a bit faster than executemany; however, it only ran 5 minutes faster than the 1 by 1 inserts.</p><h3 style="text-align: left;">SQL Server Transaction (~20 minutes)</h3><div>For more info on SQL Server Transactions see:<br /><a href="https://docs.microsoft.com/en-us/sql/t-sql/language-elements/begin-transaction-transact-sql?view=sql-server-ver15"><span style="font-size: x-small;">https://docs.microsoft.com/en-us/sql/t-sql/language-elements/begin-transaction-transact-sql?view=sql-server-ver15</span></a></div><p>For the code modification on this test, I reverted to a row-by-row cursor loop that builds batches of 100 inserts per transaction. 
It looks like this:</p><p><span style="font-family: courier;">counts = 0<br />select_stmt="SELECT col0, col1,col2,CAST(col3 AS INT) FROM table"<br />insert_stmt="BEGIN TRANSACTION \r\n"<br />for row in sqlite3Cursor.execute(select_stmt):<br /><span> </span>counts += 1<br /><span> </span>insert_stmt += f"INSERT INTO table(col0,col1,col2,col3) VALUES ({row[0]},{row[1]},'{row[2]}',{row[3]}) \r\n"<br /> #The modulus is the batch size<br /><span> </span>if counts % 100 == 0: <br /> insert_stmt+="COMMIT TRANSACTION"<br /> pyodbcCursor.execute(insert_stmt)<br /> pyodbcCursor.commit()<br /> insert_stmt="BEGIN TRANSACTION \r\n"<br />#get the last batch<br />if insert_stmt!="BEGIN TRANSACTION \r\n":<br /> insert_stmt+="COMMIT TRANSACTION"<br /> pyodbcCursor.execute(insert_stmt)<br /> pyodbcCursor.commit()</span></p>From profiler we can now see batches happening, 100 inserts at a time!<br />This also completes in 20 minutes, which is 3 to 4 times faster than the next best test.<div><br /></div><div>For what it's worth, I was able to get this down to 12 mins with a batch size of 400 rows. Batches larger than ~450 rows started dropping inserts. Troubleshooting showed that I was running into this issue:<br /><a href="https://docs.microsoft.com/en-us/troubleshoot/sql/connect/fail-run-large-batch-sql-statements">https://docs.microsoft.com/en-us/troubleshoot/sql/connect/fail-run-large-batch-sql-statements</a><br /><br />TL;DR: I had to add SET NOCOUNT ON; in order to address this. However, there was not a significant speed improvement with batch sizes larger than 400, for me.</div><h3 style="text-align: left;">Counter Log Review</h3><div>I also took a performance counter log during testing. The tests were performed in the order outlined above: 1 by 1, executemany, fast_executemany, and SQL Server Transactions. You can see four very distinct groups of executions in the graphs below. The last test, SQL Server Transactions at 100 inserts per batch, ended at 1:30. 
There was another failed test of 1000 transactions per batch between 1:30 and the end of the graphs below.</div><div><br /></div><div><p class="MsoNormal">Red is the SQL Server Transaction log drive (I’m in SIMPLE recovery mode).<br />Orange is the SQL Server Disk for data files.</p></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-JpAzq5baE5OcrrIOpDCeM_kLILPxmextaeoUe0tY7UmBw6028LOG5pyO5JVsir7haDt7zBHkgi_OPo8oQfHJh7wLrxPcIF5N87iL1u2gYdC4In1TcT3bYX6j9oVQQUTXQJBnR7JKXtw/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="556" data-original-width="594" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-JpAzq5baE5OcrrIOpDCeM_kLILPxmextaeoUe0tY7UmBw6028LOG5pyO5JVsir7haDt7zBHkgi_OPo8oQfHJh7wLrxPcIF5N87iL1u2gYdC4In1TcT3bYX6j9oVQQUTXQJBnR7JKXtw/s16000/image.png" /></a></div><p class="MsoNormal">Memory looked fine, it did not get overwhelmed.</p>
<p class="MsoNormal"><o:p></o:p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6K-uY21WxMCu2A29KfskuZpxaHtmAOZtZMRAp-ziMbnoUxWmiCmC_x5fFqVo5dwlHyw9MrFujYcISmJFaQXtt-tSQJL67iK7ihx2-LGyvkoRBg2_xzwYgbXXQfmM9Gll7e2lSng9_AI8/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="558" data-original-width="598" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6K-uY21WxMCu2A29KfskuZpxaHtmAOZtZMRAp-ziMbnoUxWmiCmC_x5fFqVo5dwlHyw9MrFujYcISmJFaQXtt-tSQJL67iK7ihx2-LGyvkoRBg2_xzwYgbXXQfmM9Gll7e2lSng9_AI8/s16000/image.png" /></a></div><p></p><p class="MsoNormal"><o:p></o:p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4-ephd3tVm5YlU31c7lcHt1u9J37DQQVV2sicnq_zWL8ShUNT9MU0GgThXfyS8wubrlz432r6NFk-sQ7voAWtC-Gfmf_KuS5lJHc4m6pQ25KwP4-WMAaMgGq35NxG3bin1o1asdV2Mxg/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="564" data-original-width="602" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4-ephd3tVm5YlU31c7lcHt1u9J37DQQVV2sicnq_zWL8ShUNT9MU0GgThXfyS8wubrlz432r6NFk-sQ7voAWtC-Gfmf_KuS5lJHc4m6pQ25KwP4-WMAaMgGq35NxG3bin1o1asdV2Mxg/s16000/image.png" /></a></div><p></p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSnDcUIcG44ksB3btdBsNx2Oga1OSuUWyWfbWGvBjQxYLpbAHIOXBNNuz4dc7LW9VqBZCJ4beDz1iwKruVyb6RlWXP0wfGWm3idilzahh-nGpyKKXWHwxZdxZSl8EMZ6UgNQ6YR6sf0aM/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="553" data-original-width="624" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSnDcUIcG44ksB3btdBsNx2Oga1OSuUWyWfbWGvBjQxYLpbAHIOXBNNuz4dc7LW9VqBZCJ4beDz1iwKruVyb6RlWXP0wfGWm3idilzahh-nGpyKKXWHwxZdxZSl8EMZ6UgNQ6YR6sf0aM/s16000/image.png" /></a></div><br /><br /></div><div><br /></div>Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com11tag:blogger.com,1999:blog-6818257984018916419.post-4658720031794654662020-03-09T08:00:00.001-07:002021-04-30T08:11:20.612-07:00Sqoop - Scheduling and Security<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHmu5VWHaOQULpssyPabUD81jVTYFHusIDgJsGnqVpTmqqCEDW8M0vKubX9pYTJGOFMFdDTY4HE9N92VDyHI0hzJc_neuf6KgUB4yrWCjthy_7Zm9_TFunBVbDkRN-gSSGol4cU5bTw10/s1600/sqoop-logo.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="46" data-original-width="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHmu5VWHaOQULpssyPabUD81jVTYFHusIDgJsGnqVpTmqqCEDW8M0vKubX9pYTJGOFMFdDTY4HE9N92VDyHI0hzJc_neuf6KgUB4yrWCjthy_7Zm9_TFunBVbDkRN-gSSGol4cU5bTw10/s1600/sqoop-logo.png" /></a></div>
<br />
In previous articles, I've walked through using Sqoop to import data to HDFS. I've also detailed how to perform full and incremental imports to Hive external and Hive managed tables.<br />
<br />
In this article I'm going to show you how to automate execution of Sqoop jobs via Cron.<br />
<br />
However, before we get to scheduling we need to address security. In prior examples I've used -P to prompt the user for login credentials interactively. With a scheduled job, this isn't going to work. Fortunately Sqoop provides us with the "password-alias" arg which allows us to pass in passwords stored in a protected keystore.<br />
Here are a couple of helpful articles related to using this functionality:<br />
<br />
<ul>
<li><span style="font-size: x-small;"><a href="https://community.cloudera.com/t5/Support-Questions/Is-it-possible-to-connect-Sql-Server-via-Sqoop-or-Nifi-with/td-p/243100">https://community.cloudera.com/t5/Support-Questions/Is-it-possible-to-connect-Sql-Server-via-Sqoop-or-Nifi-with/td-p/243100</a></span></li>
<li><span style="font-size: x-small;"><a href="http://www.hadoopadmin.co.in/bigdata/encrypt-password-used-by-sqoop-to-import-or-export-data-from-database/">http://www.hadoopadmin.co.in/bigdata/encrypt-password-used-by-sqoop-to-import-or-export-data-from-database/</a></span> </li>
</ul>
<br />
The tl;dr is:<br />
<ol>
<li>Create a credential in your HDFS home directory:<br />hadoop credential create My.password -provider jceks://hdfs/user/MyPassword.jceks</li>
<li>Reference this password in your Sqoop import command:<br />sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/MyPassword.jceks ... --username '[Login]' --password-alias My.password ...</li>
</ol>
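The two steps above can be sketched in Python as a small helper that assembles the non-interactive import command. This is only a sketch: the function name and the host, login, table, and directory values are hypothetical placeholders, and it only builds and prints the command rather than running Sqoop.<br />

```python
import shlex


def build_sqoop_import(provider_path, connect, username, password_alias, table, target_dir):
    """Return the argument list for a sqoop import that never prompts for a password."""
    return [
        "sqoop", "import",
        # Generic Hadoop -D properties go directly after the tool name.
        f"-Dhadoop.security.credential.provider.path={provider_path}",
        "--connect", connect,
        "--username", username,
        # The alias created with "hadoop credential create" replaces -P.
        "--password-alias", password_alias,
        "--table", table,
        "-m", "1",
        "--target-dir", target_dir,
        "--delete-target-dir",
    ]


# All values below are illustrative placeholders, not real endpoints.
cmd = build_sqoop_import(
    provider_path="jceks://hdfs/user/MyPassword.jceks",
    connect="jdbc:oracle:thin:@//dbhost:1521/ORCL",
    username="scott",
    password_alias="My.password",
    table="EMPLOYEES",
    target_dir="/data/employees",
)

# shlex.join quotes each argument so the line can be pasted into a shell script.
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to hand to subprocess.run later without worrying about shell quoting.<br />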
Now that we have the ability to pass in our credentials automatically we can discuss how to automate execution of Sqoop jobs.<br />
<br />
The first way is very simple and should be familiar to any Linux user: Cron / Crontab. There are a ton of articles out there (<a href="https://linuxize.com/post/scheduling-cron-jobs-with-crontab/">like this one</a>) that explain how to use Cron to schedule jobs in Linux. For our purposes it's as simple as creating a new file like /home/[User]/MyFirstSqoopJob.sh and editing the file to look something like this:<br />
<br />
sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/MyPassword.jceks --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' --password-alias My.password --table [TableName] -m 1 --target-dir [HDFS Location] --delete-target-dir<br />
<br />
You could also create a sqoop job, for an incremental import, and edit the file to look like:<br />
<br />
sqoop job -exec [JobName]<br />
<br />
Next configure Cron to execute MyFirstSqoopJob.sh:<br />
<br />
crontab -e<br />
0 8 * * * /home/[User]/MyFirstSqoopJob.sh<br />
<br />
The above schedule will run daily at 8AM.<br />
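To make the five schedule fields in that crontab entry explicit, here is a minimal Python sketch that splits the line into named parts. The helper name is hypothetical, and it only handles the plain five-field form used above, not special strings such as @daily.<br />

```python
# Field order of a standard crontab entry.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_crontab_line(line):
    """Split a five-field crontab entry into (schedule dict, command string)."""
    parts = line.split(None, 5)  # five schedule fields, then the rest is the command
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


schedule, command = split_crontab_line("0 8 * * * /home/[User]/MyFirstSqoopJob.sh")
for name in CRON_FIELDS:
    print(f"{name:>12}: {schedule[name]}")
print(f"{'command':>12}: {command}")
```

Reading the fields left to right: minute 0 of hour 8, on every day of the month, every month, and every day of the week.<br />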
<br />
Another way to schedule sqoop jobs is via Oozie.Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-15869769210331375342020-03-02T08:00:00.000-08:002020-03-02T08:00:00.141-08:00Sqoop - Incremental Imports<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHmu5VWHaOQULpssyPabUD81jVTYFHusIDgJsGnqVpTmqqCEDW8M0vKubX9pYTJGOFMFdDTY4HE9N92VDyHI0hzJc_neuf6KgUB4yrWCjthy_7Zm9_TFunBVbDkRN-gSSGol4cU5bTw10/s1600/sqoop-logo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="46" data-original-width="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHmu5VWHaOQULpssyPabUD81jVTYFHusIDgJsGnqVpTmqqCEDW8M0vKubX9pYTJGOFMFdDTY4HE9N92VDyHI0hzJc_neuf6KgUB4yrWCjthy_7Zm9_TFunBVbDkRN-gSSGol4cU5bTw10/s1600/sqoop-logo.png" /></a></div>
In my last two blog posts I walked through how to use Sqoop to perform full imports. Nightly full imports with overwrite have their place for small tables like dimension tables. However, in real-world scenarios you're also going to want a way to import only the delta values since the last time an import was run. Sqoop offers two ways to perform incremental imports: append and lastmodified.<br />
<br />
Both incremental imports can be run manually or created as a job using the "sqoop job" command. When running incremental imports manually from the command line, the "--last-value" arg is used to specify the reference value for the check-column. Alternatively, sqoop jobs store the last value of the check-column in the job, and that value is used on subsequent job runs as the where predicate in the SQL statement, i.e. select columns from table where check-column > (last-max-check-column-value).<br />
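The check-column / last-value bookkeeping can be illustrated without a cluster. The following Python sketch uses an in-memory SQLite table as a stand-in for the source RDBMS. The table, column names, and helper function are hypothetical, and it imitates the idea only, not Sqoop's actual implementation.<br />

```python
import sqlite3

# Stand-in source table with an ever-increasing id (the check-column).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO src VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 6)])


def incremental_append(conn, last_value):
    """Fetch only rows whose check-column exceeds last_value; return rows and the new last value."""
    rows = conn.execute(
        "SELECT id, payload FROM src WHERE id > ? ORDER BY id", (last_value,)
    ).fetchall()
    # If nothing new arrived, keep the old marker so the next run starts from the same place.
    new_last_value = rows[-1][0] if rows else last_value
    return rows, new_last_value


# First run: nothing imported yet, so start from 0 and pick up every row.
first_rows, last_value = incremental_append(conn, 0)

# A new row arrives in the source between runs.
conn.execute("INSERT INTO src VALUES (6, 'row6')")

# Second run: only the delta is returned.
second_rows, last_value = incremental_append(conn, last_value)
print(len(first_rows), second_rows, last_value)
```

A saved sqoop job plays the role of the last_value variable here: it remembers the marker between runs so you do not have to pass it by hand.<br />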
<br />
When using sqoop jobs, the following command args are helpful:<br />
<br />
<ul>
<li>sqoop job --create</li>
<li>sqoop job -list</li>
<li>sqoop job -show [JobName]</li>
<li>sqoop job -exec [JobName]</li>
<li>sqoop job -delete [JobName]</li>
</ul>
<div>
<br /></div>
<h3>
Incremental Import - append</h3>
<div>
<br />
The "--incremental append" arg can be passed to the sqoop import command to run append-only incremental imports. At its simplest, this type of sqoop incremental import is meant to reference an ever-increasing row id (like an Oracle sequence or a Microsoft SQL Server identity column). Here's an example sqoop import command using incremental append that can be run manually:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental append --check-column [col] --last-value [val]</div>
<div>
<br />
The output includes some important pieces of information detailing what sqoop used for the query, i.e. SELECT * from table where ID > [Lower bound value]. It also includes what the max ID was at the time of execution (the Upper bound value), and even what value to use for a subsequent run (--last-value 77) to continue appending where you left off:<br />
...<br />
Lower bound value: 46<br />
Upper bound value: 77<br />
...<br />
Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:<br />
--incremental append<br />
--check-column ID<br />
--last-value 77<br />
(Consider saving this with 'sqoop job --create')<br />
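If you do run imports manually, the values above can be pulled out of the captured output with a few regular expressions. This Python sketch runs against the sample lines shown above. The function name is hypothetical, and real log lines typically carry a timestamp and logger prefix, which is why it searches rather than matching from the start of each line.<br />

```python
import re

# Sample text copied from the output shown above.
SAMPLE_OUTPUT = """\
Lower bound value: 46
Upper bound value: 77
Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
 --incremental append
 --check-column ID
 --last-value 77
"""


def parse_bounds(output):
    """Extract the lower bound, upper bound, and suggested --last-value as strings."""
    patterns = {
        "lower": r"Lower bound value:\s*(\S+)",
        "upper": r"Upper bound value:\s*(\S+)",
        "last_value": r"--last-value\s+(\S+)",
    }
    result = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, output)
        result[name] = match.group(1) if match else None
    return result


bounds = parse_bounds(SAMPLE_OUTPUT)
print(bounds)
```

Values are kept as strings on purpose, since the same output shape is used for timestamp check-columns as well as numeric ones.<br />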
<br />
The very last line suggests creating a sqoop job instead of resorting to manual entry of the --last-value, so let's make one. The command is very similar, however instead of sqoop import, we use sqoop job:<br />
<br />
sqoop job --create [JobName] -- import --connect jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName] --username [Login] -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental append --check-column [col]<br />
<br />
During the creation of this job you'll be prompted for the password to the RDBMS.<br />
<br />
This command creates the job but does not run it. To execute the new job use:<br />
sqoop job -exec [JobName]<br />
<br />
During the execution of this job you'll be prompted for the password to the RDBMS.<br />
<br />
We can now inspect the sqoop job to learn a bit more about how things are wired up:<br />
<br />
sqoop job -show [JobName]<br />
<br />
There's a bunch of useful information in here like the following:<br />
<br />
<ul>
<li>hdfs.target.dir - Where sqoop is putting the data in HDFS</li>
<li>mapreduce.num.mappers - The number of map tasks (the -m arg)</li>
<li>hive.import - is this a hive import or an HDFS import</li>
<li>hdfs.delete-target.dir - True / False (for HDFS overwrites)</li>
<li>incremental.last.value - This is the upper bound from the previous run</li>
<li>incremental.col - The check-column</li>
<li>hive.overwrite.table - True / False (for Hive overwrites)</li>
<li>incremental.mode - AppendRows or DateLastModified</li>
<li>db.table - source table name</li>
<li>db.connect.string - RDBMS connection string</li>
</ul>
<div>
A quick way to get the previous upper bound is to pipe the output to grep:</div>
<div>
sqoop job -show [JobName] | grep incremental.last.value</div>
<div>
<br /></div>
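If you want more than one property at a time, the same output can be read into a dictionary. The Python sketch below assumes the job listing prints one "key = value" pair per line. The sample text is a hypothetical, shortened listing rather than captured output, so check it against what your Sqoop version actually prints.<br />

```python
# Hypothetical, shortened listing in the assumed "key = value" layout.
SAMPLE_JOB_OUTPUT = """\
Job: MyJob
Tool: import
Options:
----------------------------
incremental.last.value = 77
incremental.col = ID
incremental.mode = AppendRows
"""


def parse_job_properties(text):
    """Collect every 'key = value' line into a dict, ignoring all other lines."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # only lines that actually contain the separator
            props[key.strip()] = value.strip()
    return props


props = parse_job_properties(SAMPLE_JOB_OUTPUT)
print(props["incremental.last.value"], props["incremental.mode"])
```

This does the same job as the grep one-liner, but leaves the other properties available for logging or sanity checks.<br />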
It is important to note that I've been importing data via sqoop to HDFS, not Hive. We can, however, modify the command slightly to achieve append only incremental imports to Hive managed tables:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --incremental append --check-column [col] --hive-import --hive-database [DBName] --hive-table [TableName] --last-value [val]<br />
<br />
...or create a job to manage the last-value for us:<br />
<br />
sqoop job --create [JobName] -- import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --incremental append --check-column [col] --hive-import --hive-database [DBName] --hive-table [TableName]<br />
<br /></div>
<h3>
Incremental Import - lastmodified</h3>
<div>
<br /></div>
<div>
The "--incremental lastmodified" arg can be passed to the sqoop import command to run lastmodified incremental imports. This mode compares a datetime column against the time of the last sqoop execution and pulls in any rows that have a more recent timestamp. Unlike append, which only accommodates inserts from the source, this allows us to pick up updates as well.<br />
<br />
The sqoop import command for lastmodified is almost identical to the command used for append:<br />
<br />
<div>
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental lastmodified --check-column [col] --last-value '[TIMESTAMP]'<br />
<br />
I made two modifications: I changed append to lastmodified, and I updated my --check-column to a "TIMESTAMP(6)" column in my Oracle table. Sqoop expects the last-value to be in the format: 'YYYY-MM-DD HH24:MI:SS.FF'.</div>
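If the starting timestamp comes from a script rather than being typed by hand, it helps to format it explicitly. This is a minimal Python sketch, assuming the Oracle-style mask above maps onto the strftime directives shown. The function name and the sample date are hypothetical.<br />

```python
from datetime import datetime


def to_last_value(ts):
    """Format a datetime as 'YYYY-MM-DD HH:MM:SS.ffffff' for use as --last-value."""
    # %H is the 24-hour clock (HH24) and %f emits six fractional digits,
    # which lines up with the precision of a TIMESTAMP(6) column.
    return ts.strftime("%Y-%m-%d %H:%M:%S.%f")


example = to_last_value(datetime(2020, 3, 2, 8, 0, 0, 141000))
print(f"--last-value '{example}'")
```

Remember to keep the single quotes around the value on the command line, because it contains a space.<br />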
<br />
Again, the output includes some important pieces of information detailing what sqoop used for the query, including the lower bound and upper bound values (only the upper bound is listed on the first execution). The query run by sqoop ends up looking like SELECT * from table where [col] > [Lower bound value]. The Upper bound value gets set to the current time at the time of execution.<br />
...<br />
Incremental import based on column [col]<br />
Lower bound value: TO_TIMESTAMP('[TIMESTAMP]')<br />
Upper bound value: TO_TIMESTAMP('[TIMESTAMP]')<br />
...<br />
Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:<br />
--incremental lastmodified<br />
--check-column [col]<br />
--last-value [YYYY-MM-DD HH24:MI:SS.FF]<br />
(Consider saving this with 'sqoop job --create')<br />
<br />
You may have also noticed the following:<br />
<br />
Time zone has been set to GMT<br />
<br />
In addition you may have noticed that the last-value doesn't exactly match. What is actually listed as the last-value is the time at execution in GMT. In order to address this we need to provide the sqoop import command with a Hadoop timezone property (sqoop import -D oracle.sessionTimeZone):<br />
<br />
sqoop import -D oracle.sessionTimeZone=America/Denver --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental lastmodified --check-column [col] --last-value '[TIMESTAMP]'<br />
<br />
In order to turn this into a sqoop job that auto manages the last-value, the command is very similar to what we did for the append job:<br />
<br />
sqoop job -D oracle.sessionTimeZone=America/Denver --create [JobName] -- import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental lastmodified --check-column [col]<br />
<br />
We can then execute the job with the following command:<br />
sqoop job --exec [JobName]<br />
<br />
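Once a job is saved, it can be handy to look at what sqoop has stored for it between runs. The sketch below wraps the two job sub-commands I'd reach for; the function name show_saved_job is hypothetical, and it assumes the job was already created as shown above.<br />

```shell
# Hypothetical wrapper around two Sqoop job sub-commands.
# $1 = name of a saved job (assumed to exist already).
show_saved_job() {
  job_name="$1"
  sqoop job --list                # names of all saved jobs
  sqoop job --show "$job_name"    # stored definition of this job
}
```

The --show output should include the parameters sqoop saved for the job, which is where the managed last-value lives.<br />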
From here you can try updating the data in your RDBMS to have a date > the last upper bound value and re-running the job to watch the update flow through. If you do this and make no changes to anything on the HDFS side, you'll receive the following on the second run:<br />
<br />
<br />
<ul>
<li>ERROR tool.ImportTool: Import failed: --merge-key or --append is required when using --incremental lastmodified and the output directory exists.</li>
</ul>
<br />
<br />
This error is telling us that a sqoop incremental import into an HDFS directory that already exists requires an arg to either append the data (--append) or merge the data (--merge-key). You may also remember that we used --delete-target-dir in a previous blog to run daily imports with overwrite. Unfortunately, --delete-target-dir cannot be used with incremental imports. So, from here you can do a few things: provide a --merge-key (a unique column from the source database used to merge the data), specify --append (which will result in duplicate rows), or put the sqoop job in some sort of workflow which removes the target directory prior to executing the sqoop job. If you're lucky enough to have a unique primary key and a valid lastmodified timestamp column, the easiest solution is to simply add the --merge-key:<br />
<br />
sqoop import -D oracle.sessionTimeZone=America/Denver --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental lastmodified --check-column [col] --merge-key [col] --last-value '[TIMESTAMP]'<br />
<br />
...or create a job:<br />
<br />
sqoop job -D oracle.sessionTimeZone=America/Denver --create [JobName] -- import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --target-dir [HDFS Location for Hive External Table] --incremental lastmodified --check-column [col] --merge-key [col]</div>
<div>
<br />
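For the workflow option mentioned above (removing the target directory before each run), something along these lines could work. It's a rough sketch, not a tested workflow: the function name and both arguments are my own placeholders, and it assumes the rows from the previous run have already been loaded somewhere else, since the directory is wiped each time.<br />

```shell
# Hypothetical wrapper: clear the landing directory, then run the saved job.
# $1 = HDFS target directory, $2 = saved Sqoop job name (both assumed inputs).
refresh_incremental_dir() {
  target_dir="$1"
  job_name="$2"
  # -f keeps rm from failing when the directory is already gone
  hdfs dfs -rm -r -f "$target_dir" || return 1
  sqoop job --exec "$job_name"
}
```

A scheduler entry would then only need to call refresh_incremental_dir with the directory and job name.<br />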
You may now be wondering if you can just take this all directly to a Hive managed table. If you try, you'll receive the following:<br />
--incremental lastmodified option for hive imports is not supported. Please remove the parameter --incremental lastmodified.<br />
<br />
Getting everything to HDFS (or a hive external table) is pretty good though. From here a simple Hive command can load this (with overwrite or with merge) into a Hive managed table.<br />
<br />
<div>
With sqoop incremental imports you may, in some scenarios, end up with multiple/duplicate rows in HDFS or Hive: one row from the original import and a second for the newer updated data. There are several methods to address the merge, including writing SQL queries in Hive with a MAX function on the datetime column, using a sqoop merge command to merge HDFS data, or using a combination of views and staging tables, to name a few.<br />
<br />
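To give a feel for the sqoop merge option, here is a minimal sketch. The function name and argument order are made up for illustration, and it assumes you still have the jar and record class that sqoop generated during an earlier import (or via sqoop codegen).<br />

```shell
# Hypothetical wrapper around 'sqoop merge'.
# $1 = HDFS dir with the newer rows   $2 = HDFS dir with the older rows
# $3 = HDFS output dir                $4 = merge key column
# $5 = generated jar file             $6 = generated record class name
merge_increment() {
  sqoop merge \
    --new-data "$1" \
    --onto "$2" \
    --target-dir "$3" \
    --merge-key "$4" \
    --jar-file "$5" \
    --class-name "$6"
}
```

Rows from the newer directory take precedence over rows in the older one when they share the same merge key value.<br />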
I think my favorite way to accomplish the merge is to have a Hive managed table and a Hive external table. The managed table contains the final merged data. The external table is where the lastmodified data lands in Hadoop. Once the updates are loaded into the Hive external table, a Hive merge command is executed to update the data in the Hive managed table. The external table is then essentially truncated by removing the files in the related HDFS directory. Suffice it to say this is all a topic for another discussion. </div>
</div>
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-80695247939267341152020-02-24T08:00:00.000-08:002020-02-24T08:00:12.559-08:00Sqoop - Importing Data into Hadoop (Hive)<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgpnDh69_MIi7a6Et2emHly1iSL1vr5cUhm5_CEOsBmPFL7O4HAX_UjcbwOo0t8-PYBOuseDBG1ANBoWLqj3dbnMQ28IxeGuJYmHpwc7CzlF0dngU4GIiz6DOFN0ONowT9irs1NHPPXXA/s1600/sqoop-logo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="46" data-original-width="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgpnDh69_MIi7a6Et2emHly1iSL1vr5cUhm5_CEOsBmPFL7O4HAX_UjcbwOo0t8-PYBOuseDBG1ANBoWLqj3dbnMQ28IxeGuJYmHpwc7CzlF0dngU4GIiz6DOFN0ONowT9irs1NHPPXXA/s1600/sqoop-logo.png" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjv46gE3mpaxfzMXrDtK40zP_r0AvOwyOEoMPvNZ78M0mR_kZp6IFfg_cBQQQvG9hXcwWqU0mwpp_wGSMX1ZljCQ2mEb50LWtwUROto6H1ybB0SAOJdfyXJL_NNHFTC7Xo7TNZChuEUg5M/s1600/transparentHive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="360" data-original-width="765" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjv46gE3mpaxfzMXrDtK40zP_r0AvOwyOEoMPvNZ78M0mR_kZp6IFfg_cBQQQvG9hXcwWqU0mwpp_wGSMX1ZljCQ2mEb50LWtwUROto6H1ybB0SAOJdfyXJL_NNHFTC7Xo7TNZChuEUg5M/s320/transparentHive.jpg" width="320" /></a></div>
<br />
<br />
In my previous article I walked through using Sqoop to import data to Hadoop (HDFS). In this article, I'll walk through using Sqoop to import data into Hadoop (Hive).<br />
<br />
<h3>
Import to Hive Managed table</h3>
<div>
<br /></div>
There are only a few args that need to be supplied in order to instruct the sqoop import command to import data directly to Hive:<br />
<br />
<ul>
<li>--hive-import </li>
<li>--create-hive-table </li>
<li>--hive-database</li>
<li>--hive-overwrite</li>
</ul>
<div>
<br /></div>
<div>
The following is an example command that will connect to Oracle and import data directly into Hive:</div>
<div>
<br /></div>
<div>
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --hive-import --create-hive-table --hive-database [DBName]</div>
<br />
You'll need to ensure the Hive database exists prior to running the import, or else you will get an error:
<br />
<ul>
<li>Error: Error while compiling statement: FAILED: SemanticException [Error 10072]: Database does not exist:</li>
</ul>
<div>
Sqoop completes the import task by running MapReduce jobs importing the data to HDFS, and then running Hive commands (CREATE TABLE / LOAD DATA INPATH) to move the data to Hive. The default HDFS location is: /user/[login]/[TABLENAME]. If you have any issues during the import you may need to remove the HDFS directory prior to re-running, or else you will get an Error:</div>
<br />
<br />
<ul>
<li> ERROR tool.ImportTool: Import failed: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory ... already exists</li>
</ul>
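If you hit this often while testing, a small guard that removes the leftover directory only when it is actually there can help. This is a sketch under my own naming; remove_stale_import_dir is hypothetical and the path is whatever staging location your import used.<br />

```shell
# Hypothetical helper: remove a leftover import directory, if present.
# $1 = HDFS path such as /user/[login]/[TABLENAME]
remove_stale_import_dir() {
  dir="$1"
  # 'hdfs dfs -test -d' succeeds only when the path exists and is a directory
  if hdfs dfs -test -d "$dir"; then
    hdfs dfs -rm -r "$dir"
  fi
}
```

Calling it when the directory is missing does nothing and returns success, so it is safe to run before every import attempt.<br />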
<div>
Sqoop does clean up the temporary HDFS directory, and following a successful import, the HDFS directory: /user/[login]/[TABLENAME] should no longer exist. Using Sqoop to import directly to Hive creates a Hive "managed" table. Running describe on the Sqoop created Hive table will provide you with the HDFS location where the data is located.</div>
<div>
<br />
Replacing --create-hive-table with --hive-overwrite will overwrite the existing Hive table:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1 --table [TableName] --hive-import --hive-database [DBName] --hive-overwrite<br />
<br />
This task is completed the same way by Sqoop as the prior import, the only difference being that the Hive LOAD DATA INPATH command includes OVERWRITE INTO TABLE.<br />
<br />
There are some other args that can be used like --target-dir which allows you to control the temp directory where Sqoop stages the data in HDFS. Again, just make sure the directory doesn't already exist.<br />
<br />
<h3>
Import to Hive External table</h3>
<br />
It is important to note that you can accomplish the goal of importing data to a Hive External table without using any of the "hive" sqoop import args that we just went through. This can be useful if you'd like the data to live in HDFS and be accessible by Hive AND Spark. As of the time of this writing, Spark is unable to read Hive Managed tables but Spark can read data from HDFS.<br />
<br />
In order to import to a Hive external table, the first step is creating the database table. It looks something like this:<br />
<br />
CREATE EXTERNAL TABLE [TABLENAME] (col_name data_type...)<br />
ROW FORMAT DELIMITED<br />
FIELDS TERMINATED BY ','<br />
LINES TERMINATED BY '\n'<br />
STORED AS TEXTFILE<br />
<br />
Running describe on the table will provide you with the location of the directory in HDFS where Hive is looking for the related data. The default directory will look like this:<br />
<br />
hdfs://[ClusterName]/warehouse/tablespace/external/hive/[DBName].db/[TableName]<br />
<br />
From here you can run a standard Sqoop to HDFS import utilizing the target-dir. Because running with --target-dir will generate an error if the target-dir exists, we'll also add: --delete-target-dir:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P --table [TableName] -m 1 --target-dir [HDFS Location] --delete-target-dir<br />
<br />
<br /></div>
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-47047169945882408342020-02-17T08:00:00.000-08:002020-02-18T08:34:29.248-08:00Sqoop - Importing Data into Hadoop (HDFS)<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgpnDh69_MIi7a6Et2emHly1iSL1vr5cUhm5_CEOsBmPFL7O4HAX_UjcbwOo0t8-PYBOuseDBG1ANBoWLqj3dbnMQ28IxeGuJYmHpwc7CzlF0dngU4GIiz6DOFN0ONowT9irs1NHPPXXA/s1600/sqoop-logo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="46" data-original-width="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgpnDh69_MIi7a6Et2emHly1iSL1vr5cUhm5_CEOsBmPFL7O4HAX_UjcbwOo0t8-PYBOuseDBG1ANBoWLqj3dbnMQ28IxeGuJYmHpwc7CzlF0dngU4GIiz6DOFN0ONowT9irs1NHPPXXA/s1600/sqoop-logo.png" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-CvUbuUcJHQB9_61vAZABSiZ4JkJgHbvtN5X1cr-lHeONaVshBV9vjQ1yx5MGreXX8_xpwHXqqhh35JvQNPHXrstyU-nuVPjPJTtBGCn3yPar-M33RBLpleDzsW3syWh9UQwmvfUaBac/s1600/hdfs-logo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="118" data-original-width="211" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-CvUbuUcJHQB9_61vAZABSiZ4JkJgHbvtN5X1cr-lHeONaVshBV9vjQ1yx5MGreXX8_xpwHXqqhh35JvQNPHXrstyU-nuVPjPJTtBGCn3yPar-M33RBLpleDzsW3syWh9UQwmvfUaBac/s1600/hdfs-logo.jpg" /></a></div>
<br />
In this article, I'll walk through using Sqoop to import data to Hadoop (HDFS).<br />
<br />
"<a href="http://sqoop.apache.org/">Apache Sqoop</a>(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases."<br />
<br />
<h3>
Validate Connectivity</h3>
<div>
<br /></div>
In my environment I'm usually running Sqoop from a linux shell and pulling data into Hadoop from a relational database management system (RDBMS) like Oracle or Microsoft SQL Server. Before getting started with a new Sqoop project, I like to validate that I can connect to the RDBMS. A good way to test connectivity is with Curl:<br />
<br />
curl telnet://[Hostname:Port] -v<br />
If you connect successfully you should see a message like:<br />
* Connected to...<br />
If you fail to connect the message will be:<br />
* Could not resolve host...<br />
<br />
Once you've established that you can connect we can perform some preliminary Sqoop Actions.<br />
<br />
<h4>
Oracle:</h4>
This will validate that you can connect and run a simple query:<br />
sqoop eval --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P --query "select 1 from dual"<br />
<i>-P indicates that you will be prompted for your password.</i><br />
<i><br /></i>
Assuming you have the necessary permissions, this one will list the databases, or schemas, in your Oracle environment:<br />
sqoop list-databases --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P<br />
<br />
Again, assuming you have the necessary permissions, this one will list the tables, in your default schema in your Oracle environment:<br />
sqoop list-tables --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P<br />
<br />
<h4>
Microsoft SQL Server:</h4>
This will validate that you can connect and run a simple query:<br />
sqoop eval --connect 'jdbc:sqlserver://[Hostname];instanceName=[Instance];database=[DBName]' --query "select 1" --username '[Login]' -P<br />
<br />
Assuming you have the necessary permissions, this one will list the tables in your default database:<br />
sqoop list-tables --connect 'jdbc:sqlserver://[Hostname]' --username '[Login]' -P<br />
<br />
(There is no 'list-databases' equivalent in Sqoop for MS SQL Server)<br />
<br />
<h3>
Import to HDFS</h3>
<div>
<br /></div>
OK, so we've validated we can connect to an RDBMS host and have used Sqoop to further validate general access to the database. Next, we'll run a manual one-time data import to HDFS.<br />
<br />
All we need to do is use the "sqoop import" command and pass in a table name:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P --table [TableName]<br />
<br />
Sqoop handles all the datatype conversions and drops your table into HDFS as a CSV file. By default Sqoop is going to want to attempt to split up the data coming from your RDBMS based on the primary key so that it can run the import as multiple processes in parallel via MapReduce. If you don't have a primary key or have a textual index column, you may receive errors like the following:<br />
<br />
<br />
<ul>
<li>ERROR tool.ImportTool: Import failed: java.io.IOException: Generating splits for a textual index column allowed only in case of "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" property passed as a parameter</li>
<li>ERROR tool.ImportTool: Import failed: No primary key could be found for table [TableName]. Please specify one with --split-by or perform a sequential import with '-m 1'.</li>
</ul>
<br />
As stated in the error message, this can be addressed by adding '-m 1' to the import command. This tells sqoop <u>not</u> to attempt to run the import in parallel and to run only a single process:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P --table [TableName] -m 1<br />
<br />
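The error message also mentions --split-by, which is the other way out if you still want a parallel import. Here's a minimal sketch; the function name is hypothetical, the 4 mappers are an arbitrary choice, and it assumes the column you pass is numeric and reasonably evenly distributed.<br />

```shell
# Hypothetical wrapper: parallel import split on an explicit column.
# $1 = JDBC connect string, $2 = login, $3 = table, $4 = column to split on
import_in_parallel() {
  sqoop import --connect "$1" --username "$2" -P \
    --table "$3" --split-by "$4" -m 4
}
```

Sqoop divides the range of the split column across the mappers, so a heavily skewed column will leave some mappers doing most of the work.<br />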
Another potential issue when using Oracle is that you need to supply the table name in upper case. If you don't, you'll receive this error:<br />
<br />
<ul>
<li>ERROR tool.ImportTool: Import failed: There is no column found in the target table [TableName]. Please ensure that your table name is correct.</li>
</ul>
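If the table name comes from a variable or a config file, it can be upper-cased in the shell before it reaches sqoop. A small sketch, with to_upper being my own helper name:<br />

```shell
# Hypothetical helper: print $1 converted to upper case.
to_upper() {
  printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}
```

It can then be used inline, for example: --table "$(to_upper "$TABLE")"<br />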
<br />
You may receive additional errors if there are insufficient rights to the HDFS directory, so make sure the permissions are correct if you receive those errors. The default HDFS location is /user/[Login]/<br />
<br />
You can validate your import with the following HDFS command:<br />
hdfs dfs -ls /user/[login]/[TABLENAME]<br />
<br />
Here you may notice a file ending in .deflate. As mentioned earlier, Sqoop uses MapReduce to execute the import process, so the files that get written use the MapReduce defaults. In the case where you get a .deflate file, this is because MapReduce is configured by default to compress the output. The number of files here will be equal to the number of map tasks that were run to perform the import, so if you use '-m 1' you'll have 1 file. In my test case I ended up with 4 files. To read a .deflate file (part-m-00000.deflate for example) directly from HDFS, use the following:<br />
<br />
hdfs dfs -text /user/[login]/[TABLENAME]/part-m-00000.deflate<br />
<br />
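When the import produced several part files, a glob lets you peek at all of them in one go. This is a sketch with a made-up function name; the glob is quoted so that HDFS expands it rather than the local shell.<br />

```shell
# Hypothetical helper: print the first 10 rows found across all part files.
# $1 = HDFS import directory, e.g. /user/[login]/[TABLENAME]
preview_import() {
  hdfs dfs -text "$1/part-m-*" | head -n 10
}
```
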
A couple of other args you can pass into the sqoop import command are a where clause and a target directory:<br />
<br />
sqoop import --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P --table [TableName] -m 1 --where "[Predicate]" --target-dir [HDFS Location]<br />
<br />
When running with --target-dir, make sure the target-dir does <u>not</u> exist or you will get an error.<br />
<br />
A really neat thing that Sqoop can do, is grab the entire database or schema (depending on which RDBMS you're using). The only caveat is that every table must have a primary key. Here's what that command looks like:<br />
<br />
sqoop import-all-tables --connect 'jdbc:oracle:thin:@//[Hostname:Port]/[ServiceName]' --username '[Login]' -P -m 1<br />
<br />
You can use the "--exclude-tables" argument to tell Sqoop to ignore tables. For example, when running against MS SQL you'll want to add: --exclude-tables sysdiagrams.<br />
<br />
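The --exclude-tables argument takes a comma-separated list, so if you keep the names to skip in a script it's convenient to join them on the fly. A small sketch; join_tables is a hypothetical helper and the second table name in the usage line is a placeholder.<br />

```shell
# Hypothetical helper: join all arguments with commas.
join_tables() {
  # The subshell keeps the IFS change from leaking into the caller
  ( IFS=,; printf '%s' "$*" )
}
```

For example: sqoop import-all-tables ... --exclude-tables "$(join_tables sysdiagrams [OtherTable])"<br />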
<br />Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-77067673847271203672018-10-11T12:54:00.002-07:002018-10-11T12:54:55.314-07:00Oracle CLOB to Solr via the Data Import Handler<br />
<div style="text-align: center;">
<img alt="Image result for clobber time meme" src="https://ifanboy.com/wp-content/uploads/2013/03/TheThing.jpg" /></div>
<div style="text-align: center;">
<i><span style="font-size: xx-small;">no copyright infringement is intended</span></i></div>
<div style="text-align: center;">
<br /></div>
<br />
In this example I build off of the <a href="http://jonmorisissqlblog.blogspot.com/2018/06/solr-oracle-timestamp-for-use-with.html">previous example</a>, by adding an Oracle CLOB field.<br />
<br />
All in all, this is pretty simple and requires just a few updates:<br />
<ol>
<li>Update the Oracle table to have a CLOB column</li>
<li>Update the collections data config file to use the ClobTransformer</li>
<li>Update the managed-schema to use a TextField data type</li>
</ol>
<h3>
Update the Oracle table to have a CLOB column</h3>
<div>
<span style="color: blue; font-family: "courier new";">CREATE</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">TABLE</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">"[schema]"</span><span style="color: silver; font-family: "courier new";">.</span><span style="color: maroon; font-family: "courier new";">"SOLR_TEST"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">(</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"ID"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">NUMBER</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">generated</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">always</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">AS</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">identity</span><span style="color: silver; font-family: "courier new";">,</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"TIME_STAMP"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">timestamp</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">(</span><span style="color: #1e1e2c; font-family: "courier new";">6</span><span style="color: maroon; font-family: "courier new";">)</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">DEFAULT</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">sys_extract_utc</span><span style="color: maroon; font-family: "courier new";">(</span><span style="color: maroon; font-family: "courier new";">systimestamp</span><span style="color: maroon; font-family: "courier new";">)</span><span style="color: silver; font-family: "courier new";">,</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"CATEGORY"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">VARCHAR2</span><span style="color: maroon; font-family: "courier new";">(</span><span style="color: #1e1e2c; font-family: "courier new";">255</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">byte</span><span style="color: maroon; font-family: "courier new";">)</span><span style="color: silver; font-family: "courier new";">,</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"TYPE"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">VARCHAR2</span><span style="color: maroon; font-family: "courier new";">(</span><span style="color: #1e1e2c; font-family: "courier new";">255</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">byte</span><span style="color: maroon; font-family: "courier new";">)</span><span style="color: silver; font-family: "courier new";">,</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"SERVERNAME"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">VARCHAR2</span><span style="color: maroon; font-family: "courier new";">(</span><span style="color: #1e1e2c; font-family: "courier new";">255</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">byte</span><span style="color: maroon; font-family: "courier new";">)</span><span style="color: silver; font-family: "courier new";">,</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"CODE"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">VARCHAR2</span><span style="color: maroon; font-family: "courier new";">(</span><span style="color: #1e1e2c; font-family: "courier new";">255</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: maroon; font-family: "courier new";">byte</span><span style="color: maroon; font-family: "courier new";">)</span><span style="color: silver; font-family: "courier new";">,</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">"MSG"</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><span style="color: blue; font-family: "courier new";">CLOB</span><span style="color: #1e1e2c; font-family: "courier new";"> </span><br />
<span style="color: maroon; font-family: "courier new";">)</span><span style="color: silver; font-family: "courier new";">;</span></div>
<h3>
Update the collections data config file to use the ClobTransformer</h3>
<div>
<span style="font-family: 'courier new';">&lt;dataConfig&gt;<br />
&lt;dataSource driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:...[put your connection string here]" /&gt;<br />
&lt;document&gt;<br />
&lt;entity name="SOLR_TEST" transformer="<b>ClobTransformer</b>" query="SELECT ID, to_char(TIME_STAMP,'YYYY-MM-DD&quot;T&quot;HH24:MI:SS&quot;Z&quot;') AS TIME_STAMP, CATEGORY, TYPE, SERVERNAME, CODE, MSG FROM SOLR_TEST" deltaImportQuery="SELECT ID, to_char(TIME_STAMP,'YYYY-MM-DD&quot;T&quot;HH24:MI:SS&quot;Z&quot;') AS TIME_STAMP, CATEGORY, TYPE, SERVERNAME, CODE, MSG FROM SOLR_TEST WHERE ID = ${dataimporter.delta.ID}" deltaQuery="SELECT ID FROM SOLR_TEST WHERE TIME_STAMP &gt; TO_TIMESTAMP('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"&gt;</span></div>
<div>
<span style="font-family: 'courier new';">&lt;field column="ID" name="id" /&gt;<br />
&lt;field column="TIME_STAMP" name="time_stamp" /&gt;<br />
&lt;field column="CATEGORY" name="category" /&gt;<br />
&lt;field column="TYPE" name="type" /&gt;<br />
&lt;field column="SERVERNAME" name="servername" /&gt;<br />
&lt;field column="CODE" name="code" /&gt;<br />
&lt;field column="MSG" name="msg" clob="true" sourceColName="MSG" /&gt;<br />
&lt;/entity&gt;</span><br />
<span style="color: maroon; font-family: "courier new";"><document><entity font="" name=""<" nbsp=""><field column=""<" font="" nbsp=""><field column=""<" font="" nbsp=""><field column=""<" font="" nbsp=""><field column=""<" font="" nbsp=""><field column=""<" font="" nbsp=""><field column=""<" font="" nbsp=""><field column=""<" font="" nbsp=""><span style="color: red;"></document></span></field></field></field></field></field></field></field></entity></document></span></div>
<div>
<span style="color: maroon; font-family: "courier new";"></</span><span style="color: maroon; font-family: "courier new";">dataconfig</span><span style="color: maroon; font-family: "courier new";">></span></div>
<h3>
Update the managed-schema to use a TextField data type</h3>
<br />
<span style="color: #1e1e2c; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif;">...</span><br />
<span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"msg"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"text_en"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> /></span><br />
<span style="color: #1e1e2c; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif;">...</span><br />
<span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">fieldType</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"text_en"</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"<b>solr.TextField</b>"</span> <span class="hljs-attr" style="color: red;">positionIncrementGap</span>=<span class="hljs-string" style="color: #a31515;">"100"</span>></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">analyzer</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"index"</span>></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">tokenizer</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.StandardTokenizerFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.StopFilterFactory"</span> <span class="hljs-attr" style="color: red;">ignoreCase</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">words</span>=<span class="hljs-string" style="color: #a31515;">"lang/stopwords_en.txt"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.LowerCaseFilterFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.EnglishPossessiveFilterFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.KeywordMarkerFilterFactory"</span> <span class="hljs-attr" style="color: red;">protected</span>=<span class="hljs-string" style="color: #a31515;">"protwords.txt"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.PorterStemFilterFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"></<span class="hljs-name">analyzer</span>></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">analyzer</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"query"</span>></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">tokenizer</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.StandardTokenizerFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.SynonymGraphFilterFactory"</span> <span class="hljs-attr" style="color: red;">synonyms</span>=<span class="hljs-string" style="color: #a31515;">"synonyms.txt"</span> <span class="hljs-attr" style="color: red;">ignoreCase</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">expand</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.StopFilterFactory"</span> <span class="hljs-attr" style="color: red;">ignoreCase</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">words</span>=<span class="hljs-string" style="color: #a31515;">"lang/stopwords_en.txt"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.LowerCaseFilterFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.EnglishPossessiveFilterFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.KeywordMarkerFilterFactory"</span> <span class="hljs-attr" style="color: red;">protected</span>=<span class="hljs-string" style="color: #a31515;">"protwords.txt"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">filter</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.PorterStemFilterFactory"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"></<span class="hljs-name">analyzer</span>></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"></<span class="hljs-name">fieldType</span>></span><br />
<span style="color: #1e1e2c; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif;">...</span><br />
<br />
<i><span style="font-size: x-small;">Full disclosure, I built off of the "example-data-driven-schema" and the field type comes directly from there.</span></i><br />
<i><span style="font-size: x-small;"><br /></span></i>
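That's it! You can now run a full import, insert some new rows, and run a delta import. For me, the most difficult part of this process was inserting more than 4,000 characters into the CLOB column of the Oracle table, because a SQL string literal is limited to 4,000 bytes. Assigning the text to a CLOB variable in a PL/SQL block gets around that. Here's what that looks like:<br />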
<br />
<br />
<span style="font-family: "courier new";">
<span style="color: blue;">DECLARE</span>
<br /> <span style="color: maroon;">my_clob</span> <span style="color: black;"><i>CLOB</i></span><span style="color: silver;">;</span>
<br /><span style="color: blue;">BEGIN</span>
<br /> <span style="color: maroon;">my_clob</span> <span style="color: blue;">:=</span> <span style="color: red;">'4000 plus characters...'</span><span style="color: silver;">;</span>
<br /> <span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: silver;">[</span><span style="color: blue;">SCHEMA</span><span style="color: silver;">]</span><span style="color: silver;">.</span><span style="color: maroon;">solr_test</span>
<br /> <span style="color: maroon;">(</span>
<br /> <span style="color: maroon;">category</span><span style="color: silver;">,</span>
<br /> <span style="color: blue;">TYPE</span><span style="color: silver;">,</span>
<br /> <span style="color: maroon;">servername</span><span style="color: silver;">,</span>
<br /> <span style="color: maroon;">code</span><span style="color: silver;">,</span>
<br /> <span style="color: maroon;">msg</span>
<br /> <span style="color: maroon;">)</span>
<br /> <span style="color: blue;">VALUES</span>
<br /> <span style="color: maroon;">(</span>
<br /> <span style="color: red;">'Manual'</span><span style="color: silver;">,</span>
<br /> <span style="color: red;">'SQL Insert'</span><span style="color: silver;">,</span>
<br /> <span style="color: red;">'hostname'</span><span style="color: silver;">,</span>
<br /> <span style="color: red;">'BEA-100000'</span><span style="color: silver;">,</span>
<br /> <span style="color: maroon;">my_clob</span>
<br /> <span style="color: maroon;">)</span><span style="color: silver;">;</span>
<br />
<br /><span style="color: blue;">END</span><span style="color: silver;">;</span>
</span>
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com1tag:blogger.com,1999:blog-6818257984018916419.post-47929918389600609632018-06-21T12:17:00.001-07:002018-06-22T14:04:36.886-07:00Oracle TIMESTAMP for use with Solr's Data Import Handler<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGiLIsMXM1JooVgX3_nj6dQ5n4t1lBuwNOSADWaB1epiej_97jcOOhCcHSiQaUnDeBe7BR2gCl1j5aE0JjPfdvdgyuBajhZn68b9fxjQB9lmyzKdyOJW1Q3Mx7tYix-lVlnK3P-9aNTyE/s1600/oracle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="183" data-original-width="275" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGiLIsMXM1JooVgX3_nj6dQ5n4t1lBuwNOSADWaB1epiej_97jcOOhCcHSiQaUnDeBe7BR2gCl1j5aE0JjPfdvdgyuBajhZn68b9fxjQB9lmyzKdyOJW1Q3Mx7tYix-lVlnK3P-9aNTyE/s1600/oracle.png" /></a><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiESKXG5cEk5HDsOCowNIeuW6uXWsmWKZcstRyy2yOURzUmT9Bn0ftJ_dfTD5bOPKUNg2NeJiDewRqQ9u-nW0sadORr_t0EDMLfI57cgRoQJ7axMOAU12UcuVFOVVc93mPlOoRZwmIdfWE/s200/solr.jpg" /></div>
<br />
This article walks through how to set up an Oracle table with a TIMESTAMP column and use that column as the predicate for delta imports with Solr's <a href="https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html">Data Import Handler</a> (DIH). In addition, you'll likely want to use your TIMESTAMP column as a data point in Solr for sorting, faceting, etc.<br />
<br />
My first thought was to store the data in Oracle in a native format that Solr's DIH could use without any conversions, on the assumption that avoiding function calls in the SELECT statement would make imports run faster. I found no way to do this, so datetime format conversions are required. This simple test helped me get familiar with the intricacies of working with dates, including using them for delta imports. Real-world projects will likely involve a variety of datetime formats, and tables without timestamps will call for creative solutions.<br />
<br />
For this test, the first thing you'll want to do is create a column with a TIMESTAMP datatype. Because Solr defaults to UTC, I've stored the values in UTC as well. I created a test table as follows:<br />
<br />
<span style="color: blue; font-family: "courier new"; font-size: small;">CREATE</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">TABLE</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">"[schema]"</span><span style="color: silver; font-family: "courier new"; font-size: small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: small;">"SOLR_TEST"</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"ID"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">NUMBER</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">generated</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">always</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">AS</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">identity</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"TIME_STAMP"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">timestamp</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;">6</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">DEFAULT</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">sys_extract_utc</span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: small;">systimestamp</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"CATEGORY"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">VARCHAR2</span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;">255</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">byte</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"TYPE"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">VARCHAR2</span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;">255</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">byte</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"SERVERNAME"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">VARCHAR2</span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;">255</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">byte</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"CODE"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">VARCHAR2</span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;">255</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">byte</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">"MSG"</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">VARCHAR2</span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="font-family: "courier new"; font-size: small;">255</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">byte</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">;</span><br />
<span style="color: silver; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: silver; font-family: "courier new"; font-size: x-small;"></span><br />
The "ID" and "TIME_STAMP" columns will be generated automatically. The "ID" will start at 1 and increase by 1 with each insert. The "TIME_STAMP" will get a UTC date and time value generated at the time of insert. An example value is as follows:<br />
<span style="color: red; font-family: "courier new"; font-size: small;">'19-JUN-18 09.23.58.692808000 PM'</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<h3>
DIH Delta Query</h3>
<div>
Information on getting started with the DIH can be found <a href="https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#configuring-the-dih-configuration-file">here</a>.</div>
<div>
<br /></div>
We'll need to configure our deltaQuery to compare our Oracle TIME_STAMP column with Solr's variable: ${dataimporter.last_index_time}. An example value for this variable is as follows:<br />
<span style="color: red; font-family: "courier new"; font-size: small;">'2018-06-19 20:31:20'</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<br />
So, we need to configure the conversion. From a strictly Oracle perspective, the following query converts that string into a TIMESTAMP value that can be compared directly with the "TIME_STAMP" column defined above:<br />
<br />
<span style="color: blue; font-family: "courier new"; font-size: small;">SELECT</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: small;"><b>TO_TIMESTAMP</b></span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="color: red; font-family: "courier new"; font-size: small;">'2018-06-19 20:31:20'</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: red; font-family: "courier new"; font-size: small;">'YYYY-MM-DD HH24:MI:SS'</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: small;">FROM</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">dual</span><span style="color: silver; font-family: "courier new"; font-size: small;">;</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<br />
Based on this, the deltaQuery can be configured in the dataConfig file as follows:<br />
<br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">deltaQuery</span><span style="color: silver; font-family: "courier new"; font-size: small;">=</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">"</span><span style="color: blue; font-family: "courier new"; font-size: small;">SELECT</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">ID</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">FROM</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">SOLR_TEST</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">WHERE</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">TIME_STAMP</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: silver; font-family: "courier new"; font-size: small;">></span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;"><b style="color: #ff0080;">TO_TIMESTAMP</b>(</span><span style="color: red; font-family: "courier new"; font-size: small;">'${dataimporter.last_index_time}'</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: red; font-family: "courier new"; font-size: small;">'YYYY-MM-DD HH24:MI:SS'</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span 
style="color: maroon; font-family: "courier new"; font-size: small;">"</span><br />
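<br />
To preview the SQL that Oracle will receive, the placeholder substitution can be mimicked outside of Solr. The following is only a rough Python sketch, not part of the DIH itself: the helper name render_delta_query is hypothetical, and the strptime pattern is my assumed Python equivalent of the Oracle mask 'YYYY-MM-DD HH24:MI:SS'.<br />

```python
from datetime import datetime

# Assumed Python equivalent of the Oracle mask 'YYYY-MM-DD HH24:MI:SS'.
LAST_INDEX_FORMAT = "%Y-%m-%d %H:%M:%S"

# The deltaQuery text from the dataConfig file, placeholder included.
DELTA_QUERY = (
    "SELECT ID FROM SOLR_TEST WHERE TIME_STAMP > "
    "TO_TIMESTAMP('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
)


def render_delta_query(last_index_time):
    """Hypothetical helper: substitute a sample value to preview the SQL."""
    # strptime raises ValueError if the value does not match the mask,
    # which is roughly when Oracle's TO_TIMESTAMP would reject it too.
    datetime.strptime(last_index_time, LAST_INDEX_FORMAT)
    return DELTA_QUERY.replace("${dataimporter.last_index_time}", last_index_time)


print(render_delta_query("2018-06-19 20:31:20"))
```

The printed statement can be pasted into a SQL client to confirm that it returns the expected IDs before running a delta import.<br />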
<h3>
DIH Query and Managed Schema</h3>
<div>
To import your TIME_STAMP column for use in Solr, you'll need to conform to Solr's preferred date format, which is detailed <a href="https://lucene.apache.org/solr/guide/6_6/working-with-dates.html#working-with-dates">here</a>. An example value is as follows:</div>
<div>
<span style="color: red; font-family: "courier new"; font-size: small;">'2018-06-19T21:09:39Z'</span><span style="font-family: "courier new"; font-size: small;"> </span></div>
<br />
Again, from a strictly Oracle perspective, the following query produces a string in Solr's date format. Note that calling TO_TIMESTAMP without a format mask relies on the session's NLS_TIMESTAMP_FORMAT setting:<br />
<br />
<span style="color: blue; font-family: "courier new"; font-size: small;">SELECT</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: small;"><b>To_char</b></span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="color: #ff0080; font-family: "courier new"; font-size: small;"><b>To_timestamp</b></span><span style="color: maroon; font-family: "courier new"; font-size: small;">(</span><span style="color: red; font-family: "courier new"; font-size: small;">'19-JUN-18 09.23.58.692808000 PM'</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="font-family: "courier new"; font-size: small;"> </span><span style="color: red; font-family: "courier new"; font-size: small;">'YYYY-MM-DD"T"HH24:MI:SS"Z"'</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="font-family: "courier new"; font-size: small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: small;">FROM</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">dual</span><span style="color: silver; font-family: "courier new"; font-size: small;">;</span><br />
<br />
Because our Oracle "TIME_STAMP" column is already a "TIMESTAMP" datatype, we can skip the to_timestamp function in our DIH query, which looks like this:<br />
<br />
<span style="color: maroon; font-family: "courier new"; font-size: small;">query</span><span style="color: silver; font-family: "courier new"; font-size: small;">=</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">"</span><span style="color: blue; font-family: "courier new"; font-size: small;">SELECT</span><span style="font-family: "courier new"; font-size: small;"> <span style="color: maroon;">ID</span></span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: small;"><b>to_char</b></span><span style="color: maroon; font-family: "courier new"; font-size: small;">(TIME_STAMP</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: red; font-family: "courier new"; font-size: small;">'YYYY-MM-DD&quot;T&quot;HH24:MI:SS&quot;Z&quot;'</span><span style="color: maroon; font-family: "courier new"; font-size: small;">)</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">AS</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">TIME_STAMP</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">CATEGORY</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: 
small;">TYPE</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">SERVERNAME</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">CODE</span><span style="color: silver; font-family: "courier new"; font-size: small;">,</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">MSG</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: blue; font-family: "courier new"; font-size: small;">FROM</span><span style="font-family: "courier new"; font-size: small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: small;">SOLR_TEST</span><span style="color: maroon; font-family: "courier new"; font-size: small;">"</span><br />
<br />
You'll notice "<span style="color: red; font-family: "courier new";">&quot;</span>" in there a couple of times. This is the XML encoding for the double quote. I've seen conflicting documentation on which characters in the data config file need to be XML encoded and where. All I can say is to be aware of this as a potential issue, and that the in-line double quotes above do need to be encoded.<br />
<br />
I did test this with the '<span style="color: silver; font-family: "courier new"; font-size: x-small;">></span>' symbol in the <span style="color: maroon; font-family: "courier new";">deltaquery </span>using both '<span style="color: silver; font-family: "courier new"; font-size: x-small;">></span>' and '<span style="color: #dd1144; font-family: monospace; font-size: small; white-space: pre;">&gt;</span>'. Both worked.<br />
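<br />
If you'd rather not hand-encode the query, a small script can do the escaping for you. The following is a minimal sketch using Python's standard xml.sax.saxutils.escape; the helper name escape_for_attribute is my own, and the SQL is a shortened version of the query above. Note that it also encodes '>' as '&gt;', which, per my test above, DIH accepts as well:<br />

```python
from xml.sax.saxutils import escape

# The SQL as you would run it directly against Oracle (shortened column list).
SQL = (
    "SELECT ID, to_char(TIME_STAMP, 'YYYY-MM-DD\"T\"HH24:MI:SS\"Z\"') "
    "AS TIME_STAMP FROM SOLR_TEST"
)


def escape_for_attribute(sql: str) -> str:
    """Escape &, <, > and double quotes so the SQL can sit inside query="..."."""
    # escape() handles &, < and > itself; the extra mapping adds the double quote.
    return escape(sql, {'"': "&quot;"})


if __name__ == "__main__":
    print('query="' + escape_for_attribute(SQL) + '"')
```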
<br />
This formatting is done specifically to get Solr to recognize these values as dates, so that they can subsequently be used in sorting and faceting. That said, here's an excerpt of the definitions for the related fields from the "managed-schema" file:<br />
<br />
...<br />
<span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"id"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"tint"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">required</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> <span class="hljs-attr" style="color: red;">docValues</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"time_stamp"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"date"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> </span><span class="hljs-attr" style="color: red; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">docValues</span><span style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">=</span><span class="hljs-string" style="color: #a31515; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">"true"</span><span style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"> </span><span style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">/></span><br />
<span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"category"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"string"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"type"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"string"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"servername"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"string"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"code"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"string"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">field</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"msg"</span> <span class="hljs-attr" style="color: red;">type</span>=<span class="hljs-string" style="color: #a31515;">"string"</span> <span class="hljs-attr" style="color: red;">indexed</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">stored</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> <span class="hljs-attr" style="color: red;">multiValued</span>=<span class="hljs-string" style="color: #a31515;">"false"</span> /></span><br />
...<br />
<span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">fieldType</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"tint"</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.TrieIntField"</span> <span class="hljs-attr" style="color: red;">precisionStep</span>=<span class="hljs-string" style="color: #a31515;">"8"</span> <span class="hljs-attr" style="color: red;">positionIncrementGap</span>=<span class="hljs-string" style="color: #a31515;">"0"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">fieldType</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"date"</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.TrieDateField"</span> <span class="hljs-attr" style="color: red;">precisionStep</span>=<span class="hljs-string" style="color: #a31515;">"0"</span> <span class="hljs-attr" style="color: red;">positionIncrementGap</span>=<span class="hljs-string" style="color: #a31515;">"0"</span> /></span><span style="background-color: white; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;">
</span><span class="hljs-tag" style="color: blue; font-family: "monaco" , "andale mono" , "courier new" , monospace; white-space: pre-wrap;"><<span class="hljs-name">fieldType</span> <span class="hljs-attr" style="color: red;">name</span>=<span class="hljs-string" style="color: #a31515;">"string"</span> <span class="hljs-attr" style="color: red;">class</span>=<span class="hljs-string" style="color: #a31515;">"solr.StrField"</span> <span class="hljs-attr" style="color: red;">sortMissingLast</span>=<span class="hljs-string" style="color: #a31515;">"true"</span> /></span><br />
...<br />
<br />
You'll notice that docValues is set to true for the id and time_stamp fields. <a href="https://lucene.apache.org/solr/guide/6_6/docvalues.html#why-docvalues">Here's why</a>.<br />
<br />
In addition, here's the full data config file:<br />
<span style="color: maroon; font-family: "courier new"; font-size: small;"><dataConfig><br />
<dataSource driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:...[put your connection string here]" /><br />
<document><br />
<entity name="SOLR_TEST" query="SELECT ID, to_char(TIME_STAMP, 'YYYY-MM-DD&quot;T&quot;HH24:MI:SS&quot;Z&quot;') AS TIME_STAMP, CATEGORY, TYPE, SERVERNAME, CODE, MSG FROM SOLR_TEST" deltaImportQuery="SELECT ID, to_char(TIME_STAMP, 'YYYY-MM-DD&quot;T&quot;HH24:MI:SS&quot;Z&quot;') AS TIME_STAMP, CATEGORY, TYPE, SERVERNAME, CODE, MSG FROM SOLR_TEST WHERE ID = ${dataimporter.delta.ID}" deltaQuery="SELECT ID FROM SOLR_TEST WHERE TIME_STAMP > TO_TIMESTAMP('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"><br />
<field column="ID" name="id" /><br />
<field column="TIME_STAMP" name="time_stamp" /><br />
<field column="CATEGORY" name="category" /><br />
<field column="TYPE" name="type" /><br />
<field column="SERVERNAME" name="servername" /><br />
<field column="CODE" name="code" /><br />
<field column="MSG" name="msg" /><br />
</entity><br />
</document><br />
</dataConfig></span><br />
<br />
You'll notice that the "ID" coming from Oracle is capitalized and the "id" in Solr is lower case. This DOES make a difference in the data config queries. Specifically, my delta imports were failing until I changed "<span style="color: maroon; font-family: "courier new";">dataimporter.delta.id</span>" to "<span style="color: maroon; font-family: "courier new";">dataimporter.delta.ID</span>". Basically, the <span style="color: maroon; font-family: "courier new";">deltaImportQuery </span>uses the results from the <span style="color: maroon; font-family: "courier new";">deltaQuery</span>, so the name and capitalization need to match.<br />
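<br />
To make the capitalization issue concrete, here is a toy simulation of the placeholder lookup. To be clear, this is NOT how Solr implements it; the function resolve_delta and the regular expression are my own, and the only point is that a lookup keyed by the column name as Oracle returns it ("ID") will not find "id":<br />

```python
import re

# Matches placeholders such as ${dataimporter.delta.ID}
PLACEHOLDER = re.compile(r"\$\{dataimporter\.delta\.([^}]+)\}")


def resolve_delta(template: str, delta_row: dict) -> str:
    """Substitute ${dataimporter.delta.<COLUMN>} using a case-sensitive lookup.

    Illustration only: this is not Solr's implementation, and a missing key
    raises KeyError here purely to make the mismatch visible.
    """
    return PLACEHOLDER.sub(lambda m: str(delta_row[m.group(1)]), template)


if __name__ == "__main__":
    row = {"ID": 42}  # column name as returned by the deltaQuery (upper case)
    sql = "SELECT ID FROM SOLR_TEST WHERE ID = ${dataimporter.delta.ID}"
    print(resolve_delta(sql, row))
```

Swapping the placeholder to the lower case "id" makes this sketch fail with a KeyError, which mirrors the mismatch I ran into.<br />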
<br />
<h4>
P.S.</h4>
I should probably mention that when running your delta-import, you'll most likely want to set clean=false; otherwise you'll lose the documents you loaded with your full-import.<br />
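<br />
If you trigger the import from a script rather than the admin UI, the request might be built along these lines. This is a sketch under assumptions: the host "hostname:8983" and the collection name "my_collection" are placeholders, the helper delta_import_url is my own, and I've added commit=true so the changes become visible:<br />

```python
from urllib.parse import urlencode


def delta_import_url(base_url: str, collection: str) -> str:
    """Build a DIH delta-import URL that keeps existing documents (clean=false)."""
    params = {"command": "delta-import", "clean": "false", "commit": "true"}
    return f"{base_url}/{collection}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    # Host and collection name are placeholders, not values from my setup.
    print(delta_import_url("http://hostname:8983/solr", "my_collection"))
```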
<h4>
For your reference:</h4>
How Solr likes datetime:<br />
<a href="https://lucene.apache.org/solr/guide/6_6/working-with-dates.html#working-with-dates">https://lucene.apache.org/solr/guide/6_6/working-with-dates.html#working-with-dates</a><br />
<br />
Oracle datetime formatting:<br />
<a href="https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#CDEHIFJA">https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#CDEHIFJA</a><br />
<br />
The date Solr uses for delta comparisons:<br />
<span style="color: red; font-family: "courier new";">${dataimporter.last_index_time}</span><br />
This is also saved in a file called "dataimport.properties" which, in my SolrCloud configuration, is found in ZooKeeper under the 'configs' folder for the collection.<br />
<br />
<h3>
Solr DIH Configuration File with an encrypted password</h3>
Jon Morisi, published 2018-05-17<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiESKXG5cEk5HDsOCowNIeuW6uXWsmWKZcstRyy2yOURzUmT9Bn0ftJ_dfTD5bOPKUNg2NeJiDewRqQ9u-nW0sadORr_t0EDMLfI57cgRoQJ7axMOAU12UcuVFOVVc93mPlOoRZwmIdfWE/s1600/solr.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="300" data-original-width="300" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiESKXG5cEk5HDsOCowNIeuW6uXWsmWKZcstRyy2yOURzUmT9Bn0ftJ_dfTD5bOPKUNg2NeJiDewRqQ9u-nW0sadORr_t0EDMLfI57cgRoQJ7axMOAU12UcuVFOVVc93mPlOoRZwmIdfWE/s200/solr.jpg" width="200" /></a></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px;">
The data import handler is configured in solrconfig.xml via a requestHandler, name="/dataimport", which references a DIH configuration document of your choosing.</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
solrconfig.xml example:</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
<span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"><<span class="hljs-name">requestHandler</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">name</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"/dataimport"</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">class</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"solr.DataImportHandler"</span>></span><span data-mce-style="color: #000000;" style="color: black;"> </span></div>
<div data-mce-style="margin-left: 30.0px;" style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-left: 30px; margin-top: 10px;">
<span data-mce-style="color: #000000;" style="color: black;"> </span><span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"><<span class="hljs-name">lst</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">name</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"defaults"</span>></span></div>
<div data-mce-style="margin-left: 60.0px;" style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-left: 60px; margin-top: 10px;">
<span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"> </span><span data-mce-style="color: #000000;" style="color: black;"> </span><span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"><<span class="hljs-name">str</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">name</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"config"</span>></span><span data-mce-style="color: #000000;" style="color: black;">config.xml</span><span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"></<span class="hljs-name">str</span>></span></div>
<div data-mce-style="margin-left: 30.0px;" style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-left: 30px; margin-top: 10px;">
<span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"> </span><span data-mce-style="color: #000000;" style="color: black;"> </span><span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"></<span class="hljs-name">lst</span>></span><span data-mce-style="color: #000000;" style="color: black;"> </span></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
<span data-mce-style="color: #000000;" style="color: black;"> </span><span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"></<span class="hljs-name">requestHandler</span>></span></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
The config file has a lot of options. In short, this is where you configure the database connection string and reference your JDBC jar file. Full details are <a data-mce-href="https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#configuring-the-dih-configuration-file" href="https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#configuring-the-dih-configuration-file" style="color: #3572b0; text-decoration-line: none;">here</a>. By default, the examples that come with the Solr distribution use a plain text username and password, which can potentially be viewed from the front end:</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
http://hostname:8983/solr/ > Select Collection from the drop-down > Click data Import > expand configuration</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
Obviously we do not want to store our username and password in plain text. The config file includes an option to encrypt the password and then store the key in a separate file. <em>(If you're interested in the contributors discussing the security implementation, there are more details <a data-mce-href="https://issues.apache.org/jira/browse/SOLR-4392?focusedCommentId=16218863&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16218863" href="https://issues.apache.org/jira/browse/SOLR-4392?focusedCommentId=16218863&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16218863" style="color: #3572b0; text-decoration-line: none;">here</a></em><em>.)</em></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
The process for configuring this encryption is as follows:</div>
<ol style="margin: 10px 0px 0px;">
<li><span style="color: #333333; font-family: "arial" , sans-serif;"><span style="font-size: 14px;">Encrypt your password</span></span><br /><em><span style="color: #333333; font-family: "arial" , sans-serif;"><span style="font-size: 14px;">(-n is very important for the echo commands: it ensures no trailing newline character is written, which can be </span></span><a href="http://lucene.472066.n3.nabble.com/Problem-with-Password-Decryption-in-Data-Import-Handler-td4299865.html" style="color: #3572b0; font-family: arial, sans-serif; font-size: 14px;">problematic</a><span style="color: #333333; font-family: "arial" , sans-serif;"><span style="font-size: 14px;">. Errors related to neglecting -n </span></span></em><i style="color: #333333; font-family: arial, sans-serif; font-size: 14px;">include: "Error decoding password" and </i><i style="color: #333333; font-family: arial, sans-serif; font-size: 14px;">"Bad password, algorithm, mode or padding")</i><ol style="color: #333333; font-family: arial, sans-serif; font-size: 14px; list-style-type: lower-alpha; margin: 0px;">
<li><pre class="bp-text bp-text-plain hljs"><code class="txt">Write your current DB password to a file</code></pre>
<pre class="bp-text bp-text-plain hljs" style="margin-top: 10px;"><code class="txt">echo -n "mypassword" > /data/solrtmp/collection/conf/pwd.txt</code></pre>
</li>
<li><pre class="bp-text bp-text-plain hljs">Encrypt the password:</pre>
<pre class="bp-text bp-text-plain hljs" style="margin-top: 10px;"><code class="txt">openssl enc -aes-128-cbc -a -salt -in /data/solrtmp/collection/conf/pwd.txt</code></pre>
<div style="margin-top: 10px;">
During encryption, you will be prompted to enter a key.<br />
The result of the above command is a Base64-encoded encrypted value (not a hash), which will be used as the password value in the config file.</div>
</li>
<li><pre class="bp-text bp-text-plain hljs"><code class="txt">Write the key, used above to encrypt the password, to a new file:
<em>(you can name the file anything you like)</em></code></pre>
<pre class="bp-text bp-text-plain hljs" style="margin-top: 10px;"><code class="txt">echo -n "mykey" > /data/solrtmp/collection/conf/key.txt</code></pre>
</li>
<li>Remove the plain text password file:<br />rm pwd.txt</li>
</ol>
</li>
<li style="color: #333333; font-family: arial, sans-serif; font-size: 14px;">Configure file permissions, ensuring only the solr account can access this file:<br /><br />sudo chown solr:solr key.txt<br />run as solr:<br />chmod 600 key.txt</li>
<li style="color: #333333; font-family: arial, sans-serif; font-size: 14px;">Copy the decryption key to all servers, or repeat the above steps on each server<br /><em>(Any directory will work, just make sure it's the same across the cluster and the config)</em></li>
<li style="color: #333333; font-family: arial, sans-serif; font-size: 14px;">Put the details in your config file:<br />
<div style="margin-top: 10px;">
In your <span data-mce-style="color: #000000;" style="color: black;">config.xml file (it can be named anything), enter the values for user, password, and encryptKeyFile:</span></div>
<div data-mce-style="margin-left: 30.0px;" style="margin-left: 30px;">
<span data-mce-style="color: #000000;" style="color: black;"><span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"><<span class="hljs-name">dataConfig</span>></span> </span></div>
<div data-mce-style="margin-left: 60.0px;" style="margin-left: 60px;">
<span data-mce-style="color: #000000;" style="color: black;"> <span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"><<span class="hljs-name">dataSource</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">driver</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"oracle.jdbc.OracleDriver"</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">url</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"jdbc:oracle:thin:@.../..."</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">user</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"solrservice"</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">password</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"U2FsdGVkX1/mzOZi9P2iBUPEbtaHo/7SO+nOQTqqHrw="</span> <span class="hljs-attr" data-mce-style="color: #ff0000;" style="color: red;">encryptKeyFile</span>=<span class="hljs-string" data-mce-style="color: #a31515;" style="color: #a31515;">"/data/solrtmp/collection/conf/key.txt"</span> /></span> </span></div>
<div data-mce-style="margin-left: 90.0px;" style="margin-left: 90px;">
<span data-mce-style="color: #000000;" style="color: black;"> <span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"><<span class="hljs-name">document</span>></span> </span></div>
<div data-mce-style="margin-left: 120.0px;" style="margin-left: 120px;">
<span data-mce-style="color: #000000;" style="color: black;">... </span></div>
<div data-mce-style="margin-left: 90.0px;" style="margin-left: 90px;">
<span data-mce-style="color: #000000;" style="color: black;"> <span class="hljs-tag" data-mce-style="color: #0000ff;" style="color: blue;"></<span class="hljs-name">document</span>></span> </span></div>
<div data-mce-style="margin-left: 30.0px;" style="margin-left: 30px;">
<span data-mce-style="color: #000000;" style="color: black;"> </span><span data-mce-style="color: #0000ff;" style="color: blue;"></</span><span class="hljs-name" data-mce-style="color: #0000ff;" style="color: blue;">dataConfig</span><span data-mce-style="color: #0000ff;" style="color: blue;">></span></div>
</li>
<li style="color: #333333; font-family: arial, sans-serif; font-size: 14px;">If you run in SolrCloud mode, you will also need to upload the <span data-mce-style="color: #000000;" style="color: black;">config.xml file to ZooKeeper so that every node reads the same configuration.</span></li>
</ol>
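<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
The encryption steps above can be checked end to end before anything is copied into the config file. The following is a minimal sketch and not part of the original procedure: it works in a throwaway directory, reuses the placeholder values "mypassword" and "mykey", and reads the key with -pass file: so that openssl does not prompt. The -md md5 flag is an assumption: newer OpenSSL releases changed the default digest used to derive the key, and pinning the older one is believed to match what Solr's decoder expects, so verify it against your own versions.</div>

```shell
#!/usr/bin/env bash
# Round-trip check for the encryption steps, run in a throwaway directory.
# "mypassword" and "mykey" are placeholders, as in the steps above.
set -euo pipefail

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# printf '%s' writes no trailing newline, the same effect as echo -n.
printf '%s' "mypassword" > "$workdir/pwd.txt"
printf '%s' "mykey" > "$workdir/key.txt"
chmod 600 "$workdir/key.txt"

# Encrypt without a prompt by reading the key from key.txt.
# -md md5 is an assumption about the key derivation Solr expects.
encrypted="$(openssl enc -aes-128-cbc -a -salt -md md5 \
  -in "$workdir/pwd.txt" -pass "file:$workdir/key.txt" 2>/dev/null)"

# Decrypt with the same key file to confirm the value is recoverable.
decrypted="$(printf '%s\n' "$encrypted" | openssl enc -d -aes-128-cbc -a -md md5 \
  -pass "file:$workdir/key.txt" 2>/dev/null)"

if [ "$decrypted" = "mypassword" ]; then
  echo "round trip OK"
else
  echo "round trip FAILED" >&2
  exit 1
fi
```

<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
The value held in <code>encrypted</code> is what would go into the password attribute. If the round trip fails here, it will most likely fail inside Solr as well, so it is worth checking for stray newlines in either file first.</div>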
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px;">
<span data-mce-style="color: #000000;" style="color: black;"><br /></span></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px;">
<span data-mce-style="color: #000000;" style="color: black;">You have now configured an encrypted database password for Solr to use with the Data Import Handler. You can test things out by attempting to run a full-import</span><span style="color: black;">.</span></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px;">
<span data-mce-style="color: #000000;" style="color: black;"><br /></span></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px;">
<span data-mce-style="color: #000000;" style="color: black;">Commands that may be of interest while running an import:</span></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
<br /></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
Status for a data import can be viewed by requesting the following URL:</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
http://hostname:8983/solr/[collection]/dataimport?command=status</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
<br /></div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
If you submitted an asynchronous Collections API request and know its request ID, you can check its status as follows:</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
(<a data-mce-href="https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-async" href="https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-async" style="color: #3572b0; text-decoration-line: none;">https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-async</a>)</div>
<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
http://hostname:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1526317641074</div>
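<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
The status URL above can also be polled from a script. This is a rough sketch under a few assumptions: the function names are made up for illustration, port 8983 is taken from the URLs in this post, and the loop assumes the handler reports the word "busy" in its status response while an import is running.</div>

```shell
#!/usr/bin/env bash
# Build the Data Import Handler status URL for a collection.
dih_status_url() {
  local host="$1" collection="$2"
  printf 'http://%s:8983/solr/%s/dataimport?command=status\n' "$host" "$collection"
}

# Poll every 10 seconds until the response no longer contains "busy".
# Needs curl and a reachable Solr node; stops immediately if curl returns nothing.
wait_for_import() {
  local url
  url="$(dih_status_url "$1" "$2")"
  while curl -s "$url" | grep -q 'busy'; do
    sleep 10
  done
  echo "import no longer busy: $url"
}
```

<div style="color: #333333; font-family: Arial, sans-serif; font-size: 14px; margin-top: 10px;">
For example, <code>wait_for_import hostname mycollection</code> (both values are placeholders) would block until the import finishes.</div>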
<br />
Knox integration with Active Directory<br />
Posted by Jon Morisi, 2017-03-16<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqGei2fPb-f5ee1ZjB3au-XZxh8rDV_Mj-_RVgkdJdHSpvKTotQtIwSXxgYoq4JLtajfVQ0IxaFOg8C9ie0LGuve5r1vk_Iu-MFA_C6WBmVBZq3bWkoc-W5Tqgv-KUvNDlDGemwsnZG04/s1600/knox-logo%255B1%255D.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqGei2fPb-f5ee1ZjB3au-XZxh8rDV_Mj-_RVgkdJdHSpvKTotQtIwSXxgYoq4JLtajfVQ0IxaFOg8C9ie0LGuve5r1vk_Iu-MFA_C6WBmVBZq3bWkoc-W5Tqgv-KUvNDlDGemwsnZG04/s1600/knox-logo%255B1%255D.gif" /></a></div>
<br />
I've recently been doing some work with Hadoop using the Hortonworks distribution. Most recently I configured Knox to integrate with Active Directory. The end goal was to be able to authenticate with Active Directory via Knox (a REST API Gateway) and then on to other services like Hive. I also configured Knox to point to Zookeeper (HA service discovery) vs. Hive directly, but that's really more detail than we need for integrating Knox with AD.<br />
<br />
The Knox documentation is really good and very helpful:<br />
<a href="https://knox.apache.org/books/knox-0-9-0/user-guide.html">https://knox.apache.org/books/knox-0-9-0/user-guide.html</a><br />
<br />
The first thing I did was test Knox using the demo LDAP service. <br />
From Ambari > Knox > Service Actions > Start Demo LDAP<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBYXxjjmWnF6rBVThs465Acy81lscq5QKeBh7HfioV44lntwNaeIqwpXNRldb9jrBm9AKNPNRl307FbQVT555Cx-5dPSZAborLJqsLOn1l4yaIiJiiQipnapugGPhPZIRTj5UGcQBHmkc/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBYXxjjmWnF6rBVThs465Acy81lscq5QKeBh7HfioV44lntwNaeIqwpXNRldb9jrBm9AKNPNRl307FbQVT555Cx-5dPSZAborLJqsLOn1l4yaIiJiiQipnapugGPhPZIRTj5UGcQBHmkc/s1600/Capture.PNG" /></a></div>
<br />
I'm going to gloss over this because it's a generic test and pretty simple to figure out. One note: you can add users to the demo LDAP service via the "Advanced users-ldif" configuration listed in the Knox Configs section of Ambari. The default "guest" and "admin" accounts, along with their plain text passwords, are in that config location as well.<br />
<br />
A couple of status commands I found helpful for this:<br />
/usr/hdp/current/knox-server/bin/gateway.sh status<br />
/usr/hdp/current/knox-server/bin/ldap.sh status<br />
(you can also start/stop)<br />
<br />
Once the Knox service was verified, I proceeded to test configuration from Knox to Hive. In order to test this, the Hive Authentication (Ambari > Hive > Configs > HiveServer2 Authentication) was set to LDAP. <br />
<br />
Once that tested successfully, the next test was Knox to Hive via Zookeeper. Because I had <a href="http://jonmorisissqlblog.blogspot.com/2016/09/hadoop-amabari-integration-with-active.html">previously enabled Kerberos</a> in my cluster, I needed to change the Hive Authentication to use Kerberos. I had initially been using <a href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_dataintegration/content/beeline-vs-hive-cli.html">beeline</a> to test JDBC connections to Hive, but with Knox you need to test from outside the Hadoop cluster. To do that I went with <a href="http://squirrel-sql.sourceforge.net/">SQuirreL SQL Client</a> on Windows. Any JDBC-compliant client will work. I also set up the Hortonworks Hive ODBC Driver and tested from a linked server in SSMS, but I digress. <br />
<br />
OK, so with Knox installed and communication to Hive verified, I proceeded to work through the following article to get Knox and AD working together:<br />
<a href="https://cwiki.apache.org/confluence/display/KNOX/Using+Apache+Knox+with+ActiveDirectory">https://cwiki.apache.org/confluence/display/KNOX/Using+Apache+Knox+with+ActiveDirectory</a><br />
<br />
This documentation is very good and I recommend taking your time, reading it all, and saving each sample file as illustrated. This allows you to go back and reference prior settings.<br />
<br />
The documentation repeatedly references ldapwhoami and ldapsearch. I'll admit I initially attempted to configure all of this without using these utilities and only relied on the Windows tools: Active Directory Users and Computers and ldp.exe. Please take my advice and install the linux clients:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;">yum install openldap-clients</span><br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;"><br /></span>
ldapsearch provides some really useful details and I had fun working with it. Getting a better look at the AD internals that aren't easily determined from Active Directory Users and Computers was really helpful.<br />
<br />
Although the documentation is really good, it notes that it assumes default locations and that you'll need the "correct values for your environment". By using the aforementioned tools and running through several ldapsearch queries, I was eventually able to determine the necessary values.<br />
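<br />
To give a flavor of the kind of ldapsearch query involved, here is a small sketch. Everything specific in it is hypothetical: the function name, the server URL, the bind account, and the search base are placeholders, not values from my environment. It looks up one account by sAMAccountName and asks for its memberOf attribute, which helps when working out the user and group search bases.<br />

```shell
#!/usr/bin/env bash
# Look up one AD account and list its group memberships.
# Set LDAPSEARCH=echo to print the arguments instead of contacting a server.
ad_user_lookup() {
  local url="$1" bind_dn="$2" search_base="$3" account="$4"
  # -x: simple bind; -W: prompt for the bind password rather than
  # putting it on the command line.
  "${LDAPSEARCH:-ldapsearch}" -x -H "$url" -D "$bind_dn" -W \
    -b "$search_base" "(sAMAccountName=${account})" memberOf
}
```

A call would look like <code>ad_user_lookup ldap://ad.example.com svc-knox@example.com "DC=example,DC=com" jdoe</code>. The DN printed for the entry and the groups listed under memberOf are a reasonable starting point for narrowing the search bases.<br />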
<br />
I wish I could walk you through exactly what I did, but it's a bit too specific to my environment to post on a blog. What I can tell you is that the tools helped me determine the following values, which ended up in my topology file:<br />
<br />
<table border="1">
<tbody>
<tr bgcolor="d3d3d3"><td><b> Parameter </b></td><td><b> Description </b></td></tr>
<tr><td>main.ldapRealm.contextFactory.url </td><td>The LDAP URL of the server where Active Directory is running </td></tr>
<tr><td>main.ldapRealm.contextFactory.systemUsername </td><td>The user that runs the searches </td></tr>
<tr><td>main.ldapRealm.contextFactory.systemPassword </td><td>The password for the system user </td></tr>
<tr><td>main.ldapRealm.userSearchBase </td><td>The subset of users to search for authentication </td></tr>
<tr><td>main.ldapRealm.groupSearchBase </td><td>The subset of groups to search for user membership </td></tr>
</tbody></table>
<br />
One of the last things I did was store the systemPassword value in a protected credential store. This is briefly <a href="http://knox.apache.org/books/knox-0-6-0/user-guide.html#Special+note+on+parameter+main.ldapRealm.contextFactory.systemPassword">referenced</a> at the beginning of Part 2 of the Using Apache Knox with ActiveDirectory documentation. What is not mentioned is a known bug that prevents testing with knoxcli once this is enabled:<br />
<a href="https://issues.apache.org/jira/browse/KNOX-745">https://issues.apache.org/jira/browse/KNOX-745</a><br />
<br />
That's the reason I did this last. Connections will work and your systemUsername will be able to successfully run authentication queries against AD, but setting this prematurely will cause headaches with knoxcli testing. One other note that's not referenced anywhere: for whatever reason, the XML for the systemPassword parameter doesn't seem to work in the short form that's used in the documentation. That is to say...<br />
<br />
<br />
<div class="answer-body" style="-webkit-text-stroke-width: 0px; background-color: white; color: #333333; font-family: Cabin, "Helvetica Neue", Helvetica, Arial, sans-serif; font-size: 14px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: normal; letter-spacing: normal; min-height: 115px; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; word-wrap: break-word;">
<div style="margin: 0px 0px 10px;">
<span style="color: black; font-family: "times new roman"; font-size: small;">This works:</span></div>
<pre class="prettyprint linenums" style="background-color: whitesmoke; border-radius: 0px; border: 1px solid rgba(0, 0, 0, 0.14902); color: #333333; display: block; font-family: Menlo, Monaco, Consolas, "Courier New", monospace; font-size: 13px; line-height: 20px; margin: 0px 0px 20px; overflow: auto; padding: 2px 2px 2px 15px; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;"><ol class="" style="margin: 0px 0px 10px 25px; padding: 0px;">
<li class="L0" style="line-height: 20px; list-style-type: none;"><span class="tag" style="color: #000088;"><param></span></li>
<li class="L1" style="background: rgb(238, 238, 238); line-height: 20px; list-style-type: none;"><span class="pln" style="color: black;"> </span><span class="tag" style="color: #000088;"><name></span><span class="pln" style="color: black;">main.ldapRealm.contextFactory.systemPassword</span><span class="tag" style="color: #000088;"></name></span></li>
<li class="L2" style="line-height: 20px; list-style-type: none;"><span class="pln" style="color: black;"> </span><span class="tag" style="color: #000088;"><value></span><span class="pln" style="color: black;">${ALIAS=ldcSystemPassword}</span><span class="tag" style="color: #000088;"></value></span></li>
<li class="L3" style="background: rgb(238, 238, 238); line-height: 20px; list-style-type: none;"><span class="tag" style="color: #000088;"></param></span></li>
</ol>
</pre>
<div style="margin: 0px 0px 10px;">
<span style="color: black; font-family: "times new roman"; font-size: small;">This does not:</span></div>
<pre class="prettyprint linenums" style="background-color: whitesmoke; border-radius: 0px; border: 1px solid rgba(0, 0, 0, 0.14902); color: #333333; display: block; font-family: Menlo, Monaco, Consolas, "Courier New", monospace; font-size: 13px; line-height: 20px; margin: 0px 0px 20px; overflow: auto; padding: 2px 2px 2px 15px; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;"><ol class="" style="margin: 0px 0px 10px 25px; padding: 0px;">
<li class="L0" style="line-height: 20px; list-style-type: none;"><span class="tag" style="color: #000088;"><param</span><span class="pln" style="color: black;"> </span><span class="atn" style="color: #660066;">name</span><span class="pun" style="color: #666600;">=</span><span class="atv" style="color: #008800;">"main.ldapRealm.contextFactory.systemPassword"</span><span class="pln" style="color: black;"> </span><span class="atn" style="color: #660066;">value</span><span class="pun" style="color: #666600;">=</span><span class="atv" style="color: #008800;">${ALIAS=ldcSystemPassword}</span><span class="tag" style="color: #000088;">/></span></li>
</ol>
</pre>
</div>
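For reference, the alias that the working form points at has to exist in the credential store first. The sketch below shows one way that step might look. The knoxcli.sh path is assumed from the other HDP paths in this post, the function name is made up, and "default" as the topology name is an assumption; the alias name ldcSystemPassword matches the XML above.<br />

```shell
#!/usr/bin/env bash
# Store the LDAP system password under the alias the topology refers to.
# Set KNOXCLI=echo to print the arguments instead of running knoxcli.
create_ldap_alias() {
  local topology="$1" password="$2"
  "${KNOXCLI:-/usr/hdp/current/knox-server/bin/knoxcli.sh}" create-alias ldcSystemPassword \
    --cluster "$topology" --value "$password"
}
```

A call such as <code>create_ldap_alias default 'the-real-password'</code> would typically be run as the knox user. Passing the password on the command line leaves it in shell history, so treat this as a sketch rather than a hardened procedure.<br />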
If you want to configure LDAPS, only a few changes are needed.<br />
First, get the server's public certificate (the output of this command includes the certificate, not a private key):<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;">openssl s_client -connect [Server]:[Port]</span><br />
<br />
Second, import that certificate into the keystore of the JAVA_HOME that Knox is using:<br />
<br />
<ul>
<li>Find the java home directory:<br /><span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;">cat gateway.log | grep java.home</span></li>
<li>Import the certificate:<br /><span style="background-color: #f2f2f2;"><span style="color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif;"><span style="font-size: 14px;">keytool -importcert -noprompt -storepass [PW] -file [CertFile] -alias [ALIAS] -keystore [KEYSTORE]</span></span></span></li>
</ul>
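The s_client output contains a lot besides the certificate, so it helps to cut out just the PEM block before handing it to keytool. A minimal sketch, with made-up function names and with the host, port, and output file left as arguments:<br />

```shell
#!/usr/bin/env bash
# Keep only the PEM certificate block from whatever arrives on stdin.
extract_pem() {
  sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# Fetch the certificate a server presents and save it to a file.
# </dev/null makes s_client exit after the handshake instead of waiting for input.
fetch_ldaps_cert() {
  local host="$1" port="$2" out="$3"
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null | extract_pem > "$out"
}
```

The saved file is what the -file argument of the keytool command above would point at. Before importing, it is worth opening the file to confirm it is not empty and that it belongs to the server you expect.<br />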
Next, change main.ldapRealm.contextFactory.url to your LDAPS URL in your topology file.<br />
<br />
Last, restart the Knox service from Ambari and test.<br />
<br />
It is important to note that when Knox is restarted, the details from the following location overwrite the default.xml topology file in the Knox {GATEWAY_HOME}/conf/topologies directory:<br />
Ambari > Services > Knox > Configs > Advanced topology<br />
<br />
That is to say you should copy and paste your final topology information into the Advanced topology section as part of the restart test.<br />
<br />
Since I mentioned Zookeeper in the beginning, I'll also let you in on how that's configured in my topology file. The documentation walks you through multiple samples. Sample 7 leaves it to you to determine your service values. (I copied mine from the default topology.) Any service that runs through Zookeeper needs to be defined as described in the <a href="https://community.hortonworks.com/articles/72431/setup-knox-over-highly-available-hiveserver2-insta.html">Hortonworks Community</a> article, like so:<br />
<br />
<div class="answer-body" style="background-color: white; min-height: 115px; word-wrap: break-word;">
<pre class="prettyprint linenums" style="background-color: whitesmoke; border-radius: 0px; border: 1px solid rgba(0, 0, 0, 0.14902); line-height: 20px; margin-bottom: 20px; overflow: auto; padding: 2px 2px 2px 15px; word-break: break-all; word-wrap: break-word;"><ol class="" style="margin: 0px 0px 10px 25px; padding: 0px;">
<li class="L0" style="line-height: 20px; list-style-type: none;"><span style="color: #000088; font-family: "menlo" , "monaco" , "consolas" , "courier new" , monospace;"><span style="white-space: pre-wrap;"><provider>
<role>ha</role>
<name>HaProvider</name>
<enabled>true</enabled>
    <param>
        <name>HIVE</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=machine1:2181,machine2:2181,machine3:2181;zookeeperNamespace=hiveserver2</value>
    </param>
</provider></span></span></li>
</ol>
</pre>
</div>
<br />
Hadoop in real world (UDEMY online course)<br />
Posted by Jon Morisi, 2016-09-12<br />
<br />
In taking the Hadoop in real world online course at <a href="https://www.udemy.com/courses/">udemy.com</a>, I ran into a couple of issues / items to work through in order to be successful with several of the lectures.<br />
<br />
For starters the course relies on using Eclipse, JAVA, and files provided by the course. The lectures in Section 1 walk through the tools you're going to need. In order to proceed with the course I installed the following software:<br />
<br />
<a href="https://winscp.net/eng/download.php">WinSCP-5.9-Portable</a><br />
<ul>
<li>I used this to download the hirw-workshop folder from the lab server</li>
</ul>
Eclipse Java EE IDE<br />
<ul>
<li>The use of codenames can make things very confusing for first time users<br />There's another breakdown <a href="http://www.eclipse.org/eclipse/development/">here</a>.</li>
<li>I managed to end up with the Kepler version. <br />(If you pay attention in lecture 2 you'll see the instructor uses the "mars" version.)</li>
</ul>
I had trouble getting the eclipse packages to perform the maven build successfully. This happened for multiple reasons.<br />
<br />
The first issue I ran into was that I did not yet have JAVA installed. The <a href="http://www.oracle.com/technetwork/java/javase/overview/index.html">JAVA selection of versions</a> is again a veritable quandary. Do you want SE, EE, ME? ...then the JDK or JRE version?<br />
I went with <a href="http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html">Java SE Development Kit 8</a> (JDK). <br />
<br />
Once that was set up, I needed to remove the comment from the pom.xml files to include jdk.tools. However, the Java install did not create the environment variables, so I had to set them up manually:<br />
<br />
How to set up the Java environment variables:<br />
Control Panel\All Control Panel Items\System > Advanced system settings<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8dRe7yI9fAddxrbUW-Y97c-2Sghx6GIPhES6Yp25xXDXgn2fP7KUR6dRhsHw_Bg2vVHgypDZ4y7yBjyDFqU6BoMB_dl5kuBMKa5fiGEDNnW-vHkkz-mygSdjllvpnRzNuwBKIdq44wzw/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8dRe7yI9fAddxrbUW-Y97c-2Sghx6GIPhES6Yp25xXDXgn2fP7KUR6dRhsHw_Bg2vVHgypDZ4y7yBjyDFqU6BoMB_dl5kuBMKa5fiGEDNnW-vHkkz-mygSdjllvpnRzNuwBKIdq44wzw/s1600/Capture.PNG" /></a></div>
<br />
Advanced > Environment Variables<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYVN-7uDiorKZbByvzn0_9qopSL1jpPoONYcZ6MLyvCjJVJ-vOC3Zy3UgabWOHkR7tgq0oqsRK7VPCMroDqTaET_zFAAnMU-ePr8LP8edWkD7ts8P0tJmnYqB3MEwg3L7ZTuAR0pdwUUg/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYVN-7uDiorKZbByvzn0_9qopSL1jpPoONYcZ6MLyvCjJVJ-vOC3Zy3UgabWOHkR7tgq0oqsRK7VPCMroDqTaET_zFAAnMU-ePr8LP8edWkD7ts8P0tJmnYqB3MEwg3L7ZTuAR0pdwUUg/s320/Capture.PNG" width="282" /></a></div>
<br />
Under System variables > New<br />
<ul>
<li>JAVA_HOME - C:\Program Files\Java\jdk1.8.0_101</li>
<li>JRE_HOME - C:\Program Files\Java\jre1.8.0_101</li>
</ul>
<div>
Once Java could be referenced successfully, I was still getting build errors:</div>
<div>
"Failed to execute goal org.apache.maven.plugins"</div>
<div>
<a href="http://stackoverflow.com/questions/17223536/failed-to-execute-goal-org-apache-maven-pluginsmaven-compiler-plugin2-3-2comp">Stackoverflow.com</a> to the rescue!</div>
<div>
<br /></div>
<div>
I resolved this by updating the Eclipse JRE reference (to the JDK instead):</div>
<div>
Run > Run Configurations...</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1CwscT9UgYgl10BBLiz4fHt52KxTG25luoDYDMG1D7C199opWAKAGjWZuxJ_yLTlc3xDR8Q6duPnKtfV9nM6eWWKhZFFRpQHhfkw5jVwe7syC1uD8xA_thJBDmEQVgFPPiqs8Qei6Ge0/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1CwscT9UgYgl10BBLiz4fHt52KxTG25luoDYDMG1D7C199opWAKAGjWZuxJ_yLTlc3xDR8Q6duPnKtfV9nM6eWWKhZFFRpQHhfkw5jVwe7syC1uD8xA_thJBDmEQVgFPPiqs8Qei6Ge0/s1600/Capture.PNG" /></a></div>
<div>
JRE Tab, click "Installed JREs..." next to Alternate JRE:</div>
<div>
(Mine shows "jdk1.8.0_101", which is what I set it to; i.e., it's already fixed in the screenshot below.)</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgclgwBSFCHuVRwDQqG7oTRD-4hyL_CLmd2pmzHGlHI6z1iLa_tiN6bEP8Tx_30Pf41VE3OWXXf1MoiWx4DDdh8hFIrMkMRjL4JCeqRpCIz68DjQDq8UJGOrjqOCfwd87whuMOLWZpWFJc/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgclgwBSFCHuVRwDQqG7oTRD-4hyL_CLmd2pmzHGlHI6z1iLa_tiN6bEP8Tx_30Pf41VE3OWXXf1MoiWx4DDdh8hFIrMkMRjL4JCeqRpCIz68DjQDq8UJGOrjqOCfwd87whuMOLWZpWFJc/s1600/Capture.PNG" /></a></div>
<div>
From here we click Add:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_EzsD2PCVyVggwToA_Xh3i5f9HhyWSxBqEGNfQQzJngj_8Qhc2QdtjpAkD5N6PyfBgr3K9_cOYBgNeJSPCg5dcD7rn3X3w6kZv0vXrjTU_Qry8Zp7OjDVEAAWLN8C4JpqZRGgWfENse4/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_EzsD2PCVyVggwToA_Xh3i5f9HhyWSxBqEGNfQQzJngj_8Qhc2QdtjpAkD5N6PyfBgr3K9_cOYBgNeJSPCg5dcD7rn3X3w6kZv0vXrjTU_Qry8Zp7OjDVEAAWLN8C4JpqZRGgWfENse4/s1600/Capture.PNG" /></a></div>
<div>
Next</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvQs2xbxOJ4U9HeLmOfBiqi4vUHXWXhBTdG7tE-6DcESgzooXNvM3_XjLr7xEHwCz9QeFOrf9lCL3uxla8qnHQdmreJbXAEvZ_9x00oN1Eu_JMuTfDBZII9GHaWMh7RiCDlOJWvcVBI8U/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvQs2xbxOJ4U9HeLmOfBiqi4vUHXWXhBTdG7tE-6DcESgzooXNvM3_XjLr7xEHwCz9QeFOrf9lCL3uxla8qnHQdmreJbXAEvZ_9x00oN1Eu_JMuTfDBZII9GHaWMh7RiCDlOJWvcVBI8U/s1600/Capture.PNG" /></a></div>
<div>
<br /></div>
<div>
Click Directory and browse to the JDK, mine was here:</div>
<div>
C:\Program Files\Java\jdk1.8.0_101</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcF0352b3BR_2oYiQxc-gQx11CDHAwmhc5iGE00AbFgmfaroT6p68JSXl7LhKAAw09RdPzv2Fg41N0ZCtTFxsq5wCgvAjmsIOrsgNXwHiZlXLgXVXMDFwMXR_-YyuNDsrlKZVYV6UoSI0/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcF0352b3BR_2oYiQxc-gQx11CDHAwmhc5iGE00AbFgmfaroT6p68JSXl7LhKAAw09RdPzv2Fg41N0ZCtTFxsq5wCgvAjmsIOrsgNXwHiZlXLgXVXMDFwMXR_-YyuNDsrlKZVYV6UoSI0/s1600/Capture.PNG" /></a></div>
<br />
Once that was set to the JDK I unchecked the JRE reference under the "Installed JREs" and clicked OK, then close.<br />
<br />
I was then able to successfully complete Run as...>Maven build.<br />
<br />
After setting this all up I noticed the instructor was using the following:<br />
<a href="http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html">Java SE Development Kit 7</a><br />
<a href="http://www.eclipse.org/downloads/packages/eclipse-ide-java-ee-developers/mars2">Eclipse Mars</a><br />
<br />Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com2tag:blogger.com,1999:blog-6818257984018916419.post-23172428569623462182016-09-02T12:37:00.001-07:002016-09-02T12:37:39.299-07:00Hadoop / Ambari integration with Active Directory<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGBK9UgmI-JfxeFGUQlyC7Sw4DHJufm8wUYBXLzviALZZpZx0k0K8bpTaqu2xdetWf_kp04L87fC4P8YULNouOrZhPhVF1LWxUc0BZ20brTOYbKItYdh-G_DyxWtmox94LThuqspe-dqY/s1600/kerberos.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGBK9UgmI-JfxeFGUQlyC7Sw4DHJufm8wUYBXLzviALZZpZx0k0K8bpTaqu2xdetWf_kp04L87fC4P8YULNouOrZhPhVF1LWxUc0BZ20brTOYbKItYdh-G_DyxWtmox94LThuqspe-dqY/s200/kerberos.jpg" width="200" /></a></div>
<br />
One of the things you may want to do with a Hadoop environment is get it integrated with an existing Active Directory. Depending on which distribution you're using, there are several ways to go. My experience has been with Hortonworks and Ambari.<br />
<br />
The documentation I started with rather inelegantly combines running your own KDC server and using an existing Active Directory into a single, choose-your-own-adventure-style guide:<br />
<a href="https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_Ambari_Security_Guide/content/_installing_and_configuring_the_kdc.html">https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_Ambari_Security_Guide/content/_installing_and_configuring_the_kdc.html</a><br />
<br />
I found this documentation much more useful for my environment:<br />
<a href="https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Enable%20Kerberos%20in%20Ambari%20with%20Existing%20Active%20Directory">https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Enable%20Kerberos%20in%20Ambari%20with%20Existing%20Active%20Directory</a><br />
<br />
It's important to note that, "Before enabling Kerberos in the cluster, you must deploy the Java Cryptography Extension (JCE) security policy files on the Ambari Server and on all hosts in the cluster." -<a href="https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_Ambari_Security_Guide/content/_installing_the_jce.html">reference</a><br />
<br />
I found this Java program helpful for validation: <a href="https://jsosic.wordpress.com/tag/java/">https://jsosic.wordpress.com/tag/java/</a><br />
<br />
One of the pieces of information you'll need is your LDAP connection string.<br />
<a href="https://technet.microsoft.com/en-us/library/cc732952%28v=ws.11%29.aspx?f=255&MSPPError=-2147217396">dsquery </a>is a great resource for this.<br />
<br />
For example, the following returns your distinguished name, which shows what organizational unit and container you're in:<br />
<div>
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">dsquery user -name "[your login]"</span></div>
<div>
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;"><br /></span></div>
Another piece of info you'll need to find is your certificate authority. <a href="https://technet.microsoft.com/en-us/library/cc732443(v=ws.11).aspx">certutil </a>(Windows Command) has you covered here.<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">certutil -config - -ping</span><br />
<br />
(yes "- -ping"), <a href="https://blogs.technet.microsoft.com/pki/2007/05/12/a-simple-way-to-set-the-certutil-config-option/">here's why</a><br />
<br />
Is LDAP running over SSL? Check with <a href="http://windowsitpro.com/security/q-how-can-i-easily-verify-ldap-over-ssl-connectivity-my-windows-dcs">ldp.exe</a><br />
Here's how to <a href="http://windowsitpro.com/active-directory/how-use-ldap-over-ssl-lock-down-ad-traffic">set it up</a>, if it's not.<br />
<br />
You can obtain the certificate information via AD directly (see the IBM article), or by running openssl:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">openssl s_client -connect [Server]:[Port]</span><br />
<br />
You then need to trust the certificate on all the Linux hosts.<br />
From the IBM article:<br />
<ol>
<li>Create '/etc/pki/ca-trust/source/anchors/activedirectory.pem' and paste the certificate contents</li>
<li>Trust CA cert: sudo update-ca-trust enable; sudo update-ca-trust extract; sudo update-ca-trust check</li>
<li>Trust CA cert in Java: mycert=/etc/pki/ca-trust/source/anchors/activedirectory.pem; sudo keytool -importcert -noprompt -storepass changeit -file ${mycert} -alias ad -keystore /etc/pki/java/cacerts</li>
<li>Finally, make sure every node in your cluster can reach the AD host.</li>
</ol>
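<br />
To get the certificate text for step 1 out of the openssl command above, something like the following may help. It is only a sketch: extract_pem is a helper name I made up, and AD_HOST and port 636 in the usage comment are placeholders for your own [Server]:[Port]. Note that a plain s_client call prints the server's own certificate; adding -showcerts prints the whole chain it presents, so check which certificate you actually need to trust.<br />

```shell
# extract_pem (hypothetical helper, not part of openssl):
# reads `openssl s_client` output on stdin and prints only the PEM block(s),
# i.e. everything from BEGIN CERTIFICATE through END CERTIFICATE.
extract_pem() {
  sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# Example usage (AD_HOST and 636 are placeholders, adjust for your environment):
#   openssl s_client -connect "$AD_HOST:636" </dev/null 2>/dev/null \
#     | extract_pem \
#     | sudo tee /etc/pki/ca-trust/source/anchors/activedirectory.pem
```

Redirecting stdin from /dev/null makes s_client exit after the handshake instead of waiting for input.<br />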
More details on keytool:<br />
<div>
<a href="https://www.sslshopper.com/article-most-common-java-keytool-keystore-commands.html">https://www.sslshopper.com/article-most-common-java-keytool-keystore-commands.html</a></div>
<br />
Once you've got all the pre-requisites done and your configuration items noted down, enabling Kerberos is done via Ambari: Admin -> Kerberos.<br />
<br />
Here's the information you'll need:<br />
<br />
KDC<br />
KDC type: Existing Active Directory<br />
KDC host:<br />
Realm name:<br />
LDAP url:<br />
Container DN:<br />
Domains:<br />
<br />
What you put in here will be mapped to what is in the final krb5.conf files on each server. You can see the details of what Ambari is going to do by going to the following: Ambari > Kerberos > Configs > Advanced krb5-conf. <br />
<br />
Kadmin (This is required)<br />
Kadmin Host:<br />
Admin principal:<br />
Admin password:<br />
<br />
I did have to change my encryption types to match the certificate as I was getting the following error until I did so:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">kinit: Preauthentication failed while getting initial credentials</span><br />
<br />
You can further configure additional items, like encryption types here:<br />
Ambari > Kerberos > Configs > Advanced kerberos-env. <br />
<br />
I modified the Encryption Types, in Advanced kerberos-env, and un-commented the following in Advanced krb5-conf:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">#default_tgs_enctypes = {{encryption_types}}<br />
#default_tkt_enctypes = {{encryption_types}}</span><br />
<br />
A couple of commands were helpful while troubleshooting:<br />
<br />
In order to identify what encryption types were supported I ran klist against one of the keytabs:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">klist -kte /etc/security/keytabs/kerberos.service_check.082616.keytab</span><br />
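<br />
If the keytab holds many entries, a small filter can boil the klist output down to the distinct encryption types. This is a sketch under one assumption: that your klist prints the encryption type in parentheses at the end of each entry line. list_enctypes is a name I made up for illustration.<br />

```shell
# list_enctypes (hypothetical helper): reads `klist -kte` output on stdin and
# prints each distinct encryption type once.
# Assumes the enctype appears in parentheses at the end of every entry line;
# header lines without a trailing "(...)" are ignored.
list_enctypes() {
  sed -n 's/.*(\([^()]*\))[[:space:]]*$/\1/p' | sort -u
}

# Example usage:
#   klist -kte /etc/security/keytabs/kerberos.service_check.082616.keytab | list_enctypes
```

The resulting list is what I would compare against the Encryption Types setting in Advanced kerberos-env.<br />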
<br />
I was also able to manually check what Ambari was attempting to do by running kinit:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">kinit -c [Kerberos 5 cache name] -kt /etc/security/keytabs/kerberos.service_check.082616.keytab HDP_TEST-082616@DOMAIN </span>
<br />
<div>
<br />
During the initial configuration a csv file is produced by Ambari. It contains details regarding hosts, principals, and keytabs. Using this as a guide you can do some additional validation.<br />
<ol>
<li>Switch to one of the users that is running a kerberized service</li>
<li>Validate the Kerberos ticket via klist (simply run klist without any switches)</li>
</ol>
Finally I was able to validate via the Ambari front end, under Services > Kerberos > Kerberos Clients.<br />
<br />
You can also see all the users Ambari creates by inspecting AD. I created a new OU and a new account in AD before I got started in order to keep things organized. The new account was used for the Admin principal configuration. I ended up with close to 30 new user accounts in my environment.</div>
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com1tag:blogger.com,1999:blog-6818257984018916419.post-18328528965646468472016-08-26T14:21:00.004-07:002016-08-26T14:21:24.472-07:00Ambari Metrics Collector Not Starting (Connection failed: [Errno 111] Connection refused)<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAJB-oJMQLTHW2pfSAe97bdeSB3OWSkxzKhtFZhBoboScxK3cCEI6biKM9SZshcUcoFafoytNbWqdFi3C1bUhUnrHaYIZZV5iAZF2Q9VzYqBsGKiCMqg_harhc6pmFCrGBVDe0yaDfAUk/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAJB-oJMQLTHW2pfSAe97bdeSB3OWSkxzKhtFZhBoboScxK3cCEI6biKM9SZshcUcoFafoytNbWqdFi3C1bUhUnrHaYIZZV5iAZF2Q9VzYqBsGKiCMqg_harhc6pmFCrGBVDe0yaDfAUk/s200/Capture.PNG" width="179" /></a></div>
Last week I had a bit of a trial by fire: <br />
"Here's a 7 node, Hortonworks Hadoop cluster, metrics is broken, fix it, go!"<br />
<br />
The initial indication that metrics was broken was apparent in the Services tab for Ambari Metrics. Here it showed that there was an error and that Metrics Collector was Stopped. The error however wasn't very informative:<br />
<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">Connection failed: [Errno 111] Connection refused...</span><br />
<br />
That didn't tell me much at all, and neither did googling. <br />(I hope the title of this blog helps someone else find this solution quicker.)<br />
<br />
I was able to locate several log files, on the host where Metrics Collector is installed, in the following directory:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">/var/log/ambari-metrics-collector/</span><br />
<br />
Here's a list of the logs I started digging through:<br />
hbase-ams-master-[Server].out<br />
hbase-ams-master-[Server].log<br />
ambari-metrics-collector-startup.out<br />
ambari-metrics-collector.out<br />
ambari-metrics-collector.log<br />
<br />
The ambari-metrics-collector.log was the most informative, and I had errors like the following:<br />
<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect</span><br />
<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">WARN org.apache.hadoop.yarn.webapp.GenericExceptionHandler: INTERNAL_SERVER_ERROR</span><br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;"><br /></span>
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">WARN org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel: Call failed on IOException</span><br />
<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TimelineWebServices: Error saving metrics.</span><br />
<br />
As you can see, the errors referenced related components including YARN, HBase, and ZooKeeper. This sent me down quite the rabbit hole wondering which component was actually having the issue.<br />
<br />
In the end it occurred to me that maybe the collector was getting hung up trying to handle all the data the Metrics Monitors had been gathering while the collector was down. I then decided to try trashing the historical data and resetting the Metrics.<br />
<br />
I used <a href="https://community.hortonworks.com/articles/11805/how-to-solve-ambari-metrics-corrupted-data.html">this article</a> to help me through it.<br />
In short I did the following:<br />
<ol>
<li>First I stopped everything Metrics related (the collector, Grafana, and the monitors) via Ambari > Service action Stop under Ambari Metrics.</li>
<li>I then backed up and removed, via rename, the following directories:</li>
<ul>
<li>/var/lib/ambari-metrics-collector</li>
<li>/data/var/lib/ambari-metrics-collector/hbase</li>
<li>/var/var/lib/ambari-metrics-collector/hbase-tmp</li>
</ul>
<li>Finally I restarted everything Metrics related via Ambari > Service action Start under Ambari Metrics</li>
</ol>
<div>
Magic!!!</div>
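<br />
For the "backed up and removed, via rename" step, here is a minimal sketch of what I mean. backup_dir is a made-up helper name and the timestamp suffix is just one convention; run it as a user that can write to the parent directory, and only after everything under Ambari Metrics is stopped.<br />

```shell
# backup_dir (hypothetical helper): "removes" a directory by renaming it with a
# timestamp suffix, so the old data stays on disk and can be moved back later.
backup_dir() {
  dir=$1
  if [ -d "$dir" ]; then
    mv "$dir" "$dir.bak.$(date +%Y%m%d%H%M%S)"
  else
    # Nothing to do; report it on stderr rather than failing.
    echo "skipping $dir (not a directory)" >&2
  fi
}

# Example usage, once per directory from the list above:
#   backup_dir /var/lib/ambari-metrics-collector
```

Because nothing is deleted, disk usage does not drop until you remove the renamed copies yourself.<br />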
<br />
While troubleshooting this issue I also came across this list of known issues with Ambari Metrics:<br />
<a href="https://cwiki.apache.org/confluence/display/AMBARI/Known+Issues">https://cwiki.apache.org/confluence/display/AMBARI/Known+Issues</a><br />
<br />
Which raised the question: which version am I actually running?<br />
I ran this to find out:<br />
<span style="background-color: #f2f2f2; color: #333333; font-family: "cabin" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">rpm -qa | sort | grep ambari</span><br />
<br />
I may have to look into the NORMALIZATION_ENABLED workaround as a proactive measure.<br />
<br />
Along the way I found a couple of neat little tricks:<br />
<ul>
<li>How to check if metrics collector is running from the cli:<br />ambari-metrics-collector status</li>
<li>URL to get some more error details out of Ambari:<br />http://[Server:Port]/api/v1/clusters/[ClusterName]/alerts?fields=*&Alert/state.in(CRITICAL,WARNING)</li>
<li>Default URL for HBase:<br />http://[Server]:61310/master-status</li>
</ul>
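<br />
The alerts URL above has characters the shell will happily mangle (*, &amp;, parentheses), so it needs quoting when used from the command line. A small sketch, assuming you query it with curl and HTTP basic auth; ambari_alerts_url is a made-up helper name, and the host, port, cluster name and AMBARI_USER in the usage comment are placeholders.<br />

```shell
# ambari_alerts_url (hypothetical helper): builds the alerts URL shown above.
#   $1 = server:port   $2 = cluster name
ambari_alerts_url() {
  printf 'http://%s/api/v1/clusters/%s/alerts?fields=*&Alert/state.in(CRITICAL,WARNING)\n' "$1" "$2"
}

# Example usage (placeholders throughout; curl prompts for the password when
# only a user name is given to -u):
#   curl -s -u "$AMBARI_USER" "$(ambari_alerts_url ambari.example.com:8080 MyCluster)"
```

Keeping the URL inside double quotes in the curl call is what stops the shell from expanding the * or backgrounding the command at the &amp;.<br />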
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com2tag:blogger.com,1999:blog-6818257984018916419.post-63737195834323605382016-08-15T13:36:00.003-07:002016-08-15T13:38:38.893-07:00Hortonworks Sandbox Setup<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzbQCb5-MR0XA_y4FC-Qi9iSRwVVaefZe3Vyftr8WtUO-5NT-xYPevUZ-84fD8aA6ZHkwyatapU_cc4bXZ0kvq2qxQdGDmDw7Gt9lOzsp_zfWNNW952FTK1pgS0tdKEqAH8rgqXnIEfgU/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzbQCb5-MR0XA_y4FC-Qi9iSRwVVaefZe3Vyftr8WtUO-5NT-xYPevUZ-84fD8aA6ZHkwyatapU_cc4bXZ0kvq2qxQdGDmDw7Gt9lOzsp_zfWNNW952FTK1pgS0tdKEqAH8rgqXnIEfgU/s1600/Capture.PNG" /></a></div>
<br />
Last week I decided to set up a <a href="http://hortonworks.com/downloads/#sandbox">Hortonworks Sandbox</a>.<br />
<br />
There are a lot of options, VMware, VirtualBox, Azure Cloud, Amazon cloud, docker...<br />
<br />
I chose to go with VirtualBox for a couple of reasons:<br />
<ol>
<li>It's first in the list on the Hortonworks site</li>
<li>I've never used VirtualBox, and I'd like to give it a spin</li>
<li>VirtualBox comes from Oracle, Java comes from Oracle, and Hadoop seems to be pretty wired up with Java. (might as well get on the train).</li>
</ol>
<div>
The <a href="http://hortonworks.com/wp-content/uploads/2016/02/Import_on_Vbox_3_1_2016.pdf">VirtualBox Install Guide</a> is pretty straightforward:<br />
<ol>
<li><a href="https://www.virtualbox.org/wiki/Downloads">Download and install VirtualBox</a></li>
<li>Download the VirtualBox OVA file (this is the archive file that contains the <a href="https://en.wikipedia.org/wiki/VMDK">vmdk</a> and the <a href="https://en.wikipedia.org/wiki/Open_Virtualization_Format">ovf</a>)</li>
<li>Configure VirtualBox and import the image</li>
</ol>
<div>
Getting VirtualBox downloaded and installed was simple enough. However my immediate attempt to download and import the OVA file was beset with issues.</div>
<div>
<br /></div>
<div>
Click download image from the website</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIX8tY3lowAQrkldMwOR4q0Q6Qu1T-jwzC810XrM1BQcsX_WDV19sfwzbM4nZoJi1VrTXx0hKQD7gCz1Jte5Vn_ZHfw-6y2uDA-fRf7mYq51wabKOH91sVMLnBgNhKlSOvn2JG2aizzP8/s1600/Capture.PNG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="68" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIX8tY3lowAQrkldMwOR4q0Q6Qu1T-jwzC810XrM1BQcsX_WDV19sfwzbM4nZoJi1VrTXx0hKQD7gCz1Jte5Vn_ZHfw-6y2uDA-fRf7mYq51wabKOH91sVMLnBgNhKlSOvn2JG2aizzP8/s640/Capture.PNG" width="640" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiqee7PtHvThPGGF7cTqz1DXGE5qFsZUkLk5t_fxjMd3y867-3NVzVDPcSVI5aID8z1rieZyISGEnsIArpwuSGGmEsgo95PcBsA10u8DH5Rn2KrjmU36VlbR9O5pqZjqg_q26aGHZtsUA/s1600/Capture.PNG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiqee7PtHvThPGGF7cTqz1DXGE5qFsZUkLk5t_fxjMd3y867-3NVzVDPcSVI5aID8z1rieZyISGEnsIArpwuSGGmEsgo95PcBsA10u8DH5Rn2KrjmU36VlbR9O5pqZjqg_q26aGHZtsUA/s1600/Capture.PNG" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
Wait 25 minutes...</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeoVlGHdWTYQfVKPyrgH4Nyx_2-u1xzFg-amPZZzQqyrVasF2jeWkhYDqfuy7ckxwTWXex_fanMFjexQGGnX1ZqSBavg5y5oepJHyphN6IwhDbFT9as6p_mjnajJfeJR5bK7Uw27Na8wI/s1600/Capture.PNG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeoVlGHdWTYQfVKPyrgH4Nyx_2-u1xzFg-amPZZzQqyrVasF2jeWkhYDqfuy7ckxwTWXex_fanMFjexQGGnX1ZqSBavg5y5oepJHyphN6IwhDbFT9as6p_mjnajJfeJR5bK7Uw27Na8wI/s1600/Capture.PNG" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
Run the VirtualBox import, or so I thought.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAfFBe0TTIpsKy4_M-z84-oP9q3kW2q6lbfP4LwtZTtiPQjOjBpgi69Yjh4sGM0cpJrjh-x5E6tckAV2kHMjyenv3jW9mIfNxl0IThYUSc9GgCTTqADbh6HTjYE-EPSg5hy1meaalkZQM/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAfFBe0TTIpsKy4_M-z84-oP9q3kW2q6lbfP4LwtZTtiPQjOjBpgi69Yjh4sGM0cpJrjh-x5E6tckAV2kHMjyenv3jW9mIfNxl0IThYUSc9GgCTTqADbh6HTjYE-EPSg5hy1meaalkZQM/s640/Capture.PNG" width="640" /></a></div>
<div>
<br /></div>
<div>
Maybe I can extract the files and just try importing the vmdk directly?</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_hips-Rr4TuBO7J3Xv-NnjfNnQn1dWNhFV345h7bklsXSElo1u8CbvrijhKWkpXBOxadyiycUfKHmql0KPShGFOgRsTIV9owzn6Dtzoq3_M3vHY262wvoRrIVT0tVFfuiTCj4FDW5fUU/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_hips-Rr4TuBO7J3Xv-NnjfNnQn1dWNhFV345h7bklsXSElo1u8CbvrijhKWkpXBOxadyiycUfKHmql0KPShGFOgRsTIV9owzn6Dtzoq3_M3vHY262wvoRrIVT0tVFfuiTCj4FDW5fUU/s1600/Capture.PNG" /></a></div>
<div>
Fail!</div>
<div>
<br /></div>
<div>
Did you catch what's going on? My download is supposed to be 8.5 GB, but I only got 1.3 of that. Googling for my issue didn't provide much help, other than several other people searching for a solution to the same issue.</div>
<div>
<br /></div>
<div>
I did find some interesting things when googling how to resume a Chrome download. Most of them point to using <a href="https://en.wikipedia.org/wiki/Wget">wget</a>, so here's how to do that:</div>
<div>
<ol>
<li><a href="https://eternallybored.org/misc/wget/">Download wget</a></li>
<li>Extract the file (no install required)</li>
<li>Run the command, here's mine:</li>
</ol>
<div>
<span style="font-size: x-small;">wget --tries=10 --show-progress "https://d1zjfrpe8p9yc0.cloudfront.net/hdp-2.4/HDP_2.4_virtualbox_v3.ova" -O "D:\Hadoop\HDP_2.4_virtualbox_v3.ova"</span></div>
</div>
<div>
<br />
(<a href="http://www.thegeekstuff.com/2009/09/the-ultimate-wget-download-guide-with-15-awesome-examples">this</a> was helpful for putting the command together)<br />
<br /></div>
<div>
--tries sets the maximum number of attempts (10 here); when the connection drops, wget retries and picks up where the previous attempt stopped.</div>
<div>
--show-progress is self-explanatory.</div>
<div>
I pulled the download URL from the Hortonworks website:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFbKVU-4P2rshyphenhyphenSUr2xI-rr-rZbrTHzU55NVFhlCHpgJ5dABcZF5pSqKW03f1XUQsumSESg6ZeFBW-h7oAv3Db0MsE7Jjo2BNBxuWKLDIsXSLcJ7rXBbu9oRgsV2JS8n07857y95crVys/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFbKVU-4P2rshyphenhyphenSUr2xI-rr-rZbrTHzU55NVFhlCHpgJ5dABcZF5pSqKW03f1XUQsumSESg6ZeFBW-h7oAv3Db0MsE7Jjo2BNBxuWKLDIsXSLcJ7rXBbu9oRgsV2JS8n07857y95crVys/s1600/Capture.PNG" /></a></div>
<div>
-O is the output location.</div>
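<div>
Since my first attempt silently stopped well short of the full file, it's worth checking the size before importing. Here is a rough sketch; is_complete is a made-up helper name, and EXPECTED_BYTES in the usage comment is a placeholder for whatever size the server reports for the file.</div>

```shell
# is_complete (hypothetical helper): succeeds only if FILE is exactly
# EXPECTED bytes long.
#   $1 = path to the downloaded file   $2 = expected size in bytes
is_complete() {
  file=$1
  expected=$2
  # wc -c prints the byte count; tr strips the padding some systems add.
  actual=$(wc -c < "$file" | tr -d ' ')
  [ "$actual" -eq "$expected" ]
}

# Example usage (EXPECTED_BYTES is a placeholder, e.g. taken from the
# Content-Length header that `wget --spider --server-response URL` prints):
#   is_complete "D:\Hadoop\HDP_2.4_virtualbox_v3.ova" "$EXPECTED_BYTES" \
#     && echo "download looks complete"
```

<div>
A matching size doesn't prove the file is intact, but it would have caught my truncated download immediately.</div>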
<div>
<br /></div>
<div>
This got me a happy file. You'll see multiple disconnects and resumes while (w)getting it:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtDNZaR6GYazf5j1j1wCEloyB2kcMsk7oM_6ADfaDs7p7bkk2ZiY_GxJ660S45Krs4u5nuXjyNJnBj6WpRvg8dmxtuhF6KyMET8SWqE1T4wdliAzyciWSJoKo_cWxwsaDLi_Xs0Soz-KY/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="336" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtDNZaR6GYazf5j1j1wCEloyB2kcMsk7oM_6ADfaDs7p7bkk2ZiY_GxJ660S45Krs4u5nuXjyNJnBj6WpRvg8dmxtuhF6KyMET8SWqE1T4wdliAzyciWSJoKo_cWxwsaDLi_Xs0Soz-KY/s640/Capture.PNG" width="640" /></a></div>
<div>
<br /></div>
<div>
I was now able to import the appliance into VirtualBox:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCog4I0PTi9FRh90yl8eAf7rInzuerIRLzWK1_B7Py92ik34OaMT08qKWTadg33cBxRDDlvm8XmgT9eihN_VOessDRoRnpDGvwqS9mgYnMmonSNtlHMk2UpHs199Ata8hz6NREeltAXgE/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="478" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCog4I0PTi9FRh90yl8eAf7rInzuerIRLzWK1_B7Py92ik34OaMT08qKWTadg33cBxRDDlvm8XmgT9eihN_VOessDRoRnpDGvwqS9mgYnMmonSNtlHMk2UpHs199Ata8hz6NREeltAXgE/s640/Capture.PNG" width="640" /></a></div>
<div>
And start it up without issue? Nope:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTPeVJMeSzJUz1fIn30O6HWc5pOEuUFgiGDcqlVnQde_31IHA7F4TNlGXh3M4XtT2eJbXhfw7h9H6LLC_CUGgUG9JVgZc6E0T_SGPGkMF_eTZmkrDsvJuy7llX_tMQg6XFFUP0iyaNTcI/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="481" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTPeVJMeSzJUz1fIn30O6HWc5pOEuUFgiGDcqlVnQde_31IHA7F4TNlGXh3M4XtT2eJbXhfw7h9H6LLC_CUGgUG9JVgZc6E0T_SGPGkMF_eTZmkrDsvJuy7llX_tMQg6XFFUP0iyaNTcI/s640/Capture.PNG" width="640" /></a></div>
<br />
Basically VirtualBox doesn't like running a VM on a VM.<br />
<br />
So let's try running it locally? Nope:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdePC59ujvcb3DdZdr19D8WKM7RaWUsVihD3BgMtYbqxpt-JYN4Z2-KMzgkrqZ1FpkiWBSoJChBVbqGn4cI7qU7ZjcNOjxu88kYoRdk0KnPd4AG_TJx6mg9LmNcico8qEp2BBC3Vssn-c/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="478" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdePC59ujvcb3DdZdr19D8WKM7RaWUsVihD3BgMtYbqxpt-JYN4Z2-KMzgkrqZ1FpkiWBSoJChBVbqGn4cI7qU7ZjcNOjxu88kYoRdk0KnPd4AG_TJx6mg9LmNcico8qEp2BBC3Vssn-c/s640/Capture.PNG" width="640" /></a></div>
<br />
Enable VT-x in the BIOS.<br />
Set VT-x to Enabled, then press F10 to save.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-TdIhImE1laOFJS5kZyPCzoFiwulUk651lRHLCki4d2JrlfAQKrhFONARRaEB82JBdQIXp1LREiZgHYStBt9rGRV6oaOPmFFTmQOOJ_Vau-NLWev0mfa0vG-VVIi3_cHupmdNofId6I4/s1600/20160811_112005.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-TdIhImE1laOFJS5kZyPCzoFiwulUk651lRHLCki4d2JrlfAQKrhFONARRaEB82JBdQIXp1LREiZgHYStBt9rGRV6oaOPmFFTmQOOJ_Vau-NLWev0mfa0vG-VVIi3_cHupmdNofId6I4/s640/20160811_112005.jpg" width="640" /></a></div>
<br />
Ahh, there we go (finally!)<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtlZNI5iIwKXx8Wc3qAvKr4hDOLBZQC-1tOi9nVjEVPucq33L4XZtAvDePtHl5CVIpTHgM_sky7H1x15ypxJI7OFqzf-CP-NINt1nUD5VGNq9KqMvsDhfeUWC76s1NVRSSYihBzbxf27Q/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="396" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtlZNI5iIwKXx8Wc3qAvKr4hDOLBZQC-1tOi9nVjEVPucq33L4XZtAvDePtHl5CVIpTHgM_sky7H1x15ypxJI7OFqzf-CP-NINt1nUD5VGNq9KqMvsDhfeUWC76s1NVRSSYihBzbxf27Q/s640/Capture.PNG" width="640" /></a></div>
<br />
Lessons learned:<br />
<br />
Use wget for downloads that repeatedly drop the connection.<br />
Don't try to run VirtualBox inside a VM.<br />
Enable VT-x in the BIOS to run VirtualBox.<br />
<br />
P.S. I also tried to be clever with VirtualBox inside a VM, but none of the following worked:<br />
<br />
Turn off the hardware marriage?<br />
..\VirtualBox>VBoxManage modifyvm "Hortonworks Sandbox with HDP 2.4" --hwvirtex off<br />
C:\Program Files\Oracle\VirtualBox>VBoxManage modifyvm "Hortonworks Sandbox with HDP 2.4" --vtxvpid off<br />
<br />
..\VirtualBox>VBoxManage startvm "Hortonworks Sandbox with HDP 2.4"<br />
Waiting for VM "Hortonworks Sandbox with HDP 2.4" to power on...<br />
VM "Hortonworks Sandbox with HDP 2.4" has been successfully started.<br />
<br />
Except, not:<br />
Effective Paravirt. Provider: None<br />
"Your 64-bit guest will fail to detect a 64-bit cpu and will not be able to boot"<br />
<br />
The <a href="https://forums.virtualbox.org/viewtopic.php?f=6&t=39216">VirtualBox forums</a> were very <a href="https://forums.virtualbox.org/viewtopic.php?f=6&t=39216">helpful</a>.</div>
</div>
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com1tag:blogger.com,1999:blog-6818257984018916419.post-5998952567180314642016-08-08T11:58:00.002-07:002016-08-08T15:07:11.316-07:00Getting started with Hadoop<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyG7C9OW2_PCmy9rm1MZ5fWLMOkeM8bO5sBW36AeVVkbY61TzAGG9XJR_q-XG2G4b4-zET6O_Il_gMeekwomfG61Df4EXj6bsJSfBBSKxiuUWZv-V6kllZIX7rPLLW4ehcG-__AvDs3vc/s1600/intro-duction-to-Hadoop-full.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="237" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyG7C9OW2_PCmy9rm1MZ5fWLMOkeM8bO5sBW36AeVVkbY61TzAGG9XJR_q-XG2G4b4-zET6O_Il_gMeekwomfG61Df4EXj6bsJSfBBSKxiuUWZv-V6kllZIX7rPLLW4ehcG-__AvDs3vc/s320/intro-duction-to-Hadoop-full.jpg" width="320" /></a></div>
<br />
This past week I dove into the Hadoop pool. It's certainly overwhelming at first. Have you seen the list of components?<br />
<br />
Avro<br />
Flume<br />
HBase<br />
HDFS<br />
Hive<br />
Hue<br />
Impala<br />
Mahout<br />
MapReduce<br />
Oozie<br />
Pig<br />
Spark<br />
Sqoop<br />
YARN<br />
ZooKeeper<br />
etc.<br />
<br />
In addition to all the components there are multiple distributions to choose from:<br />
Cloudera - CDH<br />
Hortonworks<br />
MapR<br />
Roll your own<br />
<br />
...and each of these distributions has various editions.<br />
<br />
The Edureka resources were very helpful for understanding how all of this comes together:<br />
<a href="http://www.edureka.co/blog/essential-hadoop-tools-for-big-data">http://www.edureka.co/blog/essential-hadoop-tools-for-big-data</a><br />
<br />
The following screenshots came from this video:<br />
<a href="https://www.youtube.com/watch?v=zjdN3IxUh6A">https://www.youtube.com/watch?v=zjdN3IxUh6A</a><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgERzV2-Q-FaRrtSIGArNTwF8bYJh6dyNPjAkaadC2eBrW_1comUj_BBaKcC-teu9fuhjHZPW7hNi9kYUuHs0-QuAE4kKB3dvuW_zy4mjeEMcxntOmrhFUEzVHHKfkG2Q-nhDiyGHU0aNM/s1600/01Cloudera.CDH.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="316" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgERzV2-Q-FaRrtSIGArNTwF8bYJh6dyNPjAkaadC2eBrW_1comUj_BBaKcC-teu9fuhjHZPW7hNi9kYUuHs0-QuAE4kKB3dvuW_zy4mjeEMcxntOmrhFUEzVHHKfkG2Q-nhDiyGHU0aNM/s640/01Cloudera.CDH.PNG" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgM871oG3Rn1mw9PMpqfoQ0jEzO-xti0nAymwqpbrWH4zPWtTnF9vUSeswSxyYFM-y8OHIgkPQoqbuzsTOzFbpcpzGpWMHwmJUGlW1MPkdgNnqRh0ZIm56E0mVJuIMWYk9NHm9wJH-ghFE/s1600/03Hortonworks.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="318" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgM871oG3Rn1mw9PMpqfoQ0jEzO-xti0nAymwqpbrWH4zPWtTnF9vUSeswSxyYFM-y8OHIgkPQoqbuzsTOzFbpcpzGpWMHwmJUGlW1MPkdgNnqRh0ZIm56E0mVJuIMWYk9NHm9wJH-ghFE/s640/03Hortonworks.PNG" width="640" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaBou1npaDmbXCb6q6hzYdfIyew5KAY5g-Z6voTtPDlK6YRSKo4JVRYzv21KAqDMjj_YCJRWdFzVh5Rf1fsRQilQvSGMGxZJ-iflFMdqww0AhyhnmNSLToEQRVMQCi3gaazV7Ja8FRqqc/s1600/06MapR.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="330" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaBou1npaDmbXCb6q6hzYdfIyew5KAY5g-Z6voTtPDlK6YRSKo4JVRYzv21KAqDMjj_YCJRWdFzVh5Rf1fsRQilQvSGMGxZJ-iflFMdqww0AhyhnmNSLToEQRVMQCi3gaazV7Ja8FRqqc/s640/06MapR.PNG" width="640" /></a></div>
<br />
<br />
<br />
One of the first things I did was get this book, and start reading:<br />
<iframe frameborder="0" marginheight="0" marginwidth="0" scrolling="no" src="//ws-na.amazon-adsystem.com/widgets/q?ServiceVersion=20070822&OneJS=1&Operation=GetAdHtml&MarketPlace=US&source=ss&ref=as_ss_li_til&ad_type=product_link&tracking_id=jonmorisissql-20&marketplace=amazon&region=US&placement=1491901632&asins=1491901632&linkId=7ff64ca628f1c7c733ca3e1a4edf425c&show_border=true&link_opens_in_new_window=true" style="height: 240px; width: 120px;"></iframe><br />
<br />
I also started watching videos:<br />
<br />
<a href="https://youtu.be/zjdN3IxUh6A?list=PLZOnyWT8_Q0uWiG-A8rc16jYdKzw-zs_3">Apache VS Cloudera VS MapR VS Hortonworks : Which Hadoop Distribution To Use?</a><br />
<a href="https://youtu.be/IvtyruO4dN4?list=PLZOnyWT8_Q0uWiG-A8rc16jYdKzw-zs_3">Hadoop Distributions - Cloudera vs Hortonworks vs MapR vs Intel</a><br />
<a href="https://youtu.be/xYnS9PQRXTg?list=PLZOnyWT8_Q0uWiG-A8rc16jYdKzw-zs_3">Hadoop - Just the Basics for Big Data Rookies</a><br />
<br />
...and when I was ready to get hands-on (immediately), this Udemy Hadoop starter kit was great!<br />
<a href="https://www.udemy.com/hadoopstarterkit/learn/v4/overview">Hadoop Starter Kit</a><br />
The instructor walks you through the basics and they provide you with credentials to get into and start running commands on a Cloudera - CDH system.<br />
<br />
From here I think I'll start playing around on a sandbox. Each of the distributions offers a way to spin up a VM or log into a cloud-based environment. There are also <a href="https://hub.docker.com/explore/">Docker images</a> out there (search hadoop, cloudera, or hortonworks). Most of these Docker images look fairly new, so don't cut yourself.<br />
<br />
I'm looking at the Hortonworks distro, so I'll probably set up a <a href="http://hortonworks.com/downloads/#sandbox">Hortonworks Sandbox</a>.<br />
<br />
One last note: I also set up a couple of RSS feeds, via the <a href="https://community.hortonworks.com/topics/rss.html">Hortonworks Community Connection</a>, as a way to keep a pulse on what the experts are talking about. <br />
<br />
If anyone else has a good list of Hadoop or Hortonworks feeds, perhaps for individual components, I'm interested.<br />
<br />
There's also this mailing list: <a href="https://hadoop.apache.org/mailing_lists.html">https://hadoop.apache.org/mailing_lists.html</a><br />
<br />
Thanks for reading. I hope you find the links and videos useful.Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-39758852762431439132016-07-12T15:07:00.002-07:002016-07-12T15:07:48.205-07:00SQL Server 2014 Service Pack 2<img alt="TSQL2SDAY-150x150" src="https://chrisyatessql.files.wordpress.com/2016/07/tsql2sday-150x150.png?w=700" /><br />
This month's T-SQL Tuesday topic is: "treat yourself to a birthday gift".<br />
<br />
As it so happens Microsoft provided us an excellent gift in the form of <a href="https://support.microsoft.com/en-us/kb/3171021">SQL Server 2014 Service Pack 2</a>. Included are many fixes and improvements. I saw the individual KB RSS feed posts coming through just prior to the SP2 release. It looked like there were some good improvements in there, so I was particularly interested in looking through those.<br />
<br />
Before I get into the distilled list of improvements I do want to point out this fix because it looks like a pretty big deal. It's also listed last in the release notes, so there's a good chance some people may overlook it.<br />
<span style="font-size: medium;">
TYPE: Restore
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6589587 </td></tr>
<tr><td><b> KB article number </b></td><td> 3065060 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3065060">FIX: "Unable to create restore plan due to break in the LSN chain" error when you restore differential backup in SSMS</a> </td></tr>
<tr><td><b> Notes </b></td><td> "When you restore the full backup file by using the NORECOVERY option, and then you restore the differential backup file by using the RECOVERY option in SSMS, the operation fails" </td></tr>
</table>
</span>
<br />
On to the improvements...<br />
There's some good stuff in here: UTF-8 support, CLONEDATABASE, additional tempdb logging, additional events for availability groups, and change tracking manual cleanup, for example.
<br />
<span style="font-size: medium;">
TYPE: BULK / bcp
<table border=1 width="100%">
<col width="100">
<tbody>
<tr><td><b> VSTS bug number </b></td><td>6715815 </td></tr>
<tr><td><b> KB article number </b></td><td>3136780 </td></tr>
<tr><td><b> Description </b></td><td><a href="https://support.microsoft.com/en-us/kb/3136780">UTF-8 encoding support for the BCP utility and BULK INSERT Transact-SQL command in SQL Server 2014 SP2</a> </td></tr>
<tr><td><b> Notes </b></td><td>"Import support is added to the BCP utility and to the BULK INSERT<br />
Export support is added to the BCP utility<br />
BULK INSERT WITH (CODEPAGE = '65001', DATAFILETYPE = 'Char')<br />
bcp… -C 65001" </td></tr>
</tbody></table>
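<br />
Here's a rough sketch of what a UTF-8 import might look like with the new option. The table name and file path are placeholders I made up for illustration; they don't come from the KB article.<br />
<span style="font-family: 'courier new'; font-size: x-small;">
-- dbo.Utf8Import and the file path are hypothetical placeholders<br />
-- CODEPAGE = '65001' marks the data file as UTF-8<br />
BULK INSERT dbo.Utf8Import<br />
FROM 'C:\Temp\utf8_data.txt'<br />
WITH (CODEPAGE = '65001', DATAFILETYPE = 'Char');<br />
</span>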
<br />
TYPE: DBCC
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6935655 </td></tr>
<tr><td><b> KB article number </b></td><td> 3177838 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3177838">DBCC CLONEDATABASE (Transact-SQL) is introduced in SQL Server 2014 Service Pack 2</a> </td></tr>
<tr><td><b> Notes </b></td><td> DBCC CLONEDATABASE (source_database_name, target_database_name)<br />
Generates a schema- and statistics-only copy.<br />
"DBCC CLONEDATABASE isn't supported to be used as a production database and is primarily intended for troubleshooting and diagnostic purposes" </td></tr>
</table>
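<br />
As a minimal sketch, assuming a source database called SalesDb (both database names below are placeholders of my own), the call might look like this:<br />
<span style="font-family: 'courier new'; font-size: x-small;">
-- SalesDb and SalesDb_Clone are hypothetical names<br />
-- Creates a schema- and statistics-only copy for troubleshooting<br />
DBCC CLONEDATABASE (SalesDb, SalesDb_Clone);<br />
</span>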
<br />
TYPE: Dynamic Management Function
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 5990425 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170114 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170114">Update to add DMF sys.dm_db_incremental_stats_properties in SQL Server 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> sys.dm_db_incremental_stats_properties</td></tr>
</table>
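<br />
The function takes an object ID and a statistics ID. A rough sketch of a call, assuming a partitioned table named dbo.MyPartitionedTable whose statistic with stats_id 1 was created as incremental (both are assumptions for illustration):<br />
<span style="font-family: 'courier new'; font-size: x-small;">
-- dbo.MyPartitionedTable is a placeholder; stats_id 1 is assumed to be an incremental statistic<br />
SELECT *<br />
FROM sys.dm_db_incremental_stats_properties(OBJECT_ID('dbo.MyPartitionedTable'), 1);<br />
</span>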
<br />
TYPE: Dynamic Management View
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6588995 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107398 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107398">Improved memory grant diagnostics when you use DMV in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> Adds new (grant/parallelism) columns to sys.dm_exec_query_stats </td></tr>
</table>
<br />
TYPE: Extended Events
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6589007 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107172 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107172">Improve tempdb spill diagnostics by using Extended Events in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> New extended event: hash_spill_details </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6589008 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107173 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107173">Improved memory grant diagnostics using Extended Events in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td>New extended event: query_memory_grant_usage. <br />
"You can define a memory limit as a filter for this new extended event so that the extended event fires only when the memory grant of a query exceeds the limit." </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6685025, 3512277 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170113 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170113">Update to expose per-operator query execution statistics in showplan XML and Extended Event in SQL Server 2014 SP2</a> </td></tr>
<tr><td><b> Notes </b></td><td> "New Extended Event: query_thread_profile
Provides per-operator query execution statistics" </td></tr>
</table>
<br />
TYPE: Extended Event / Performance Monitor
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6988068 </td></tr>
<tr><td><b> KB article number </b></td><td> 3173156 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3173156">Update adds AlwaysOn extended events and performance counters in SQL Server 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> Adds and changes multiple Extended Events and performance counters for Availability Groups. </td></tr>
</table>
<br />
TYPE: Extended Event / SQL Server Error Log
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6589061, 6589062, 6589063, 6589064 </td></tr>
<tr><td><b> KB article number </b></td><td> 3112363 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3112363">Improvements for SQL Server AlwaysOn Lease Timeout supportability in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td>The availability_group_lease_expired and hadr_ag_lease_renewal XEvents have been improved (Lease stages detail) <br />
New SQL Server Error Messages: 19419-19424 </td></tr>
</table>
<br />
TYPE: LOG: Cluster log
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6651348 </td></tr>
<tr><td><b> KB article number </b></td><td> 3156304 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3156304">Improvement for SQL Server AlwaysOn Lease Timeout supportability in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> "System performance objects are reported in the Cluster log following a lease timeout:
Date/Time, Processor time(%), Available memory(bytes), Avg disk read(secs), Avg disk write(secs)" </td></tr>
</table>
<br />
TYPE: LOG: SQL Server error logs
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6944985 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170019 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170019">Update to add info about SQL Server startup account to security policy in SQL Server 2014 error log</a> </td></tr>
<tr><td><b> Notes </b></td><td> Information about whether Database Instant File Initialization is enabled is now in the error log </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6945027 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170020 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170020">Informational messages added for tempdb configuration in the SQL Server error log in SQL Server 2014 SP2</a> </td></tr>
<tr><td><b> Notes </b></td><td> "New informational startup messages:<br />
The tempdb database has %ld data file(s).<br />
The tempdb database data files are not configured to use the same initial size and autogrowth settings…" </td></tr>
</table>
<br />
TYPE: Memory
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6936784 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170022 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170022">Update lets SQL Server 2014 use all the user-mode virtual address space for a process</a> </td></tr>
<tr><td><b> Notes </b></td><td> "Starting in Windows 8.1 and Windows Server 2012 R2, the user-mode virtual address space for each 64-bit process in 64-bit Windows is 128 TB" </td></tr>
</table>
<br />
TYPE: Query Hint
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6588981 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107401 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107401">New query memory grant options are available (min_grant_percent and max_grant_percent) in SQL Server 2012</a> </td></tr>
<tr><td><b> Notes </b></td><td> OPTION (min_grant_percent = x, max_grant_percent = y) </td></tr>
</table>
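<br />
For illustration only, here's how the hint might be attached to a query. The table, columns, and percentages are arbitrary placeholders; appropriate values depend on the workload.<br />
<span style="font-family: 'courier new'; font-size: x-small;">
-- dbo.Orders and its columns are hypothetical; 10 and 50 are arbitrary example percentages<br />
SELECT CustomerId, COUNT(*) AS OrderCount<br />
FROM dbo.Orders<br />
GROUP BY CustomerId<br />
ORDER BY OrderCount DESC<br />
OPTION (MIN_GRANT_PERCENT = 10, MAX_GRANT_PERCENT = 50);<br />
</span>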
<br />
TYPE: Showplan XML
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6589011 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107397 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107397">Improved diagnostics for query execution plans that involve residual predicate pushdown in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> New showplan XML attribute: Actual Rows Read </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6589060 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107400 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107400">Improved tempdb spill diagnostics in Showplan XML schema in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> "SortSpillDetails and HashSpillDetails
Warning section of the tooltip for the Sort operation or the Hash operation in the graphical execution plan output" </td></tr>
</table>
<br />
TYPE: Showplan XML
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6170317 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170112 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170112">Update to expose maximum memory enabled for a single query in Showplan XML in SQL Server 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> New Attributes: OptimizerHardwareDependentProperties\MaxCompileMemory and MemoryGrantInfo\MaxQueryMemory </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6170324 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170115 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170115">Information about enabled trace flags is added to the showplan XML in SQL Server 2014 SP2</a> </td></tr>
<tr><td><b> Notes </b></td><td> Includes the scope details of the enabled trace flags </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 2197153 </td></tr>
<tr><td><b> KB article number </b></td><td> 3172997 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3172997">Update to add memory grant warning to the Showplan XML in SQL Server 2014 SP2</a> </td></tr>
<tr><td><b> Notes </b></td><td> New warning: "MemoryGrantWarning" </td></tr>
</table>
<br />
TYPE: Spatial Data
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 6588999 </td></tr>
<tr><td><b> KB article number </b></td><td> 3107399 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3107399">Spatial performance improvements in SQL Server 2012 and 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> Trace flag 6533<br />
Spatial data type performance improvement (no details). <br />Don't use this if you use the STRelate and/or STAsBinary functions. </td></tr>
</table>
<br />
TYPE: Stored Procedure
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 4297600 </td></tr>
<tr><td><b> KB article number </b></td><td> 3170123 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3170123">Supports DROP TABLE DDL for articles that are included in transactional replication in SQL Server 2014</a> </td></tr>
<tr><td><b> Notes </b></td><td> Exec sp_changepublication @publication = '&lt;Publication Name&gt;', @property = 'allow_drop', @value = 'true'<br />
Exec sp_addpublication @publication = '&lt;Publication Name&gt;', ..., @allow_drop = N'true'<br />
"A table can be dropped only if the allow_drop property is set to TRUE on all the publications"<br />
Will this drop the table on the subscriber? (The article doesn't go into those details.) </td></tr>
</table>
<br />
<table border=1 width="100%">
<col width="100">
<tr><td><b> VSTS bug number </b></td><td> 7579503 </td></tr>
<tr><td><b> KB article number </b></td><td> 3173157 </td></tr>
<tr><td><b> Description </b></td><td> <A HREF="https://support.microsoft.com/en-us/kb/3173157">Adds a stored procedure for the manual cleanup of the change tracking side table in SQL Server 2014 SP2</a> </td></tr>
<tr><td><b> Notes </b></td><td> sp_flush_CT_internal_table_on_demand [ @TableToClean= ] 'TableName'<br />
Addresses this: <a href="http://sirsql.net/content/2014/04/03/201443change-tracking-cleanup-limitation/">Change Tracking Cleanup Limitation</a></td></tr>
</table>
</span>Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-48755627319244402952016-07-07T09:11:00.000-07:002016-07-07T09:11:04.076-07:00 CAST as numeric rounding issue (Oracle Number to SQL Server numeric)While working on a migration project, I discovered that two different versions of SQL Server round and/or truncate implicit conversions from an Oracle NUMBER data type to a SQL Server numeric data type differently.<br />
<br />
Here's my test case:<br />
<br />
ENVIRONMENT:<br />
<br />
Two servers, one running Microsoft SQL Server 2014 and another running SQL Server 2008 R2.<br />
<br />
<a href="http://jonmorisissqlblog.blogspot.com/2016/05/oracle-linked-server-setup.html">Set up an Oracle 11g Linked Server</a>. Both servers are running the OraOLEDB.Oracle Provider with the same options and driver version (<a href="http://www.oracle.com/technetwork/database/enterprise-edition/downloads/112010-win64soft-094461.html">11.02.00.01</a>).<br />
<br />
Set up a test table in Oracle:<br />
<br />
<span style="font-family: "courier new"; font-size: x-small;">
<span style="color: blue;">CREATE</span> <span style="color: blue;">TABLE</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span>
<br /> <span style="color: maroon;">(</span>
<br /> <span style="color: maroon;">UNDEFINED </span><span style="color: black;"><i>NUMBER</i></span>
<br /> <span style="color: maroon;">)</span>
</span>
<br />
<br />
Throw some test values in there:<br />
<br />
<span style="font-family: "courier new"; font-size: x-small;">
<span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span>
<span style="color: blue;">VALUES</span> <span style="color: maroon;">(</span><span style="color: silver;">-</span><span style="color: black;">98.786</span><span style="color: maroon;">)</span><br /><span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span> <span style="color: blue;">VALUES</span> <span style="color: maroon;">(</span><span style="color: silver;">-</span><span style="color: black;">98.785</span><span style="color: maroon;">)</span>
<br /><span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span> <span style="color: blue;">VALUES</span> <span style="color: maroon;">(</span><span style="color: silver;">-</span><span style="color: black;">98.784</span><span style="color: maroon;">)</span>
<br /><span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span> <span style="color: blue;">VALUES</span> <span style="color: maroon;">(</span><span style="color: black;">98.784</span><span style="color: maroon;">)</span>
<br /><span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span> <span style="color: blue;">VALUES</span> <span style="color: maroon;">(</span><span style="color: black;">98.785</span><span style="color: maroon;">)</span>
<br /><span style="color: blue;">INSERT</span> <span style="color: blue;">INTO</span> <span style="color: maroon;">[MySchema]</span><span style="color: silver;">.</span><span style="color: maroon;">number_test</span> <span style="color: blue;">VALUES</span> <span style="color: maroon;">(</span><span style="color: black;">98.786</span><span style="color: maroon;">)</span> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">COMMIT</span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
THE TEST:<br />
<br />
Run the following SQL from both SQL Server 2014 and SQL Server 2008 R2:<br />
<br />
<span style="font-family: "courier new"; font-size: x-small;">
<span style="color: blue;">SELECT</span> <span style="color: maroon;">undefined</span><span style="color: silver;">,</span>
<br /> <span style="color: magenta;"><i>Cast</i></span><span style="color: maroon;">(</span><span style="color: maroon;">undefined</span> <span style="color: blue;">AS</span> <span style="color: black;"><i>NUMERIC</i></span><span style="color: maroon;">(</span><span style="color: black;">20</span><span style="color: silver;">,</span> <span style="color: black;">2</span><span style="color: maroon;">)</span><span style="color: maroon;">)</span> <span style="color: blue;">AS </span> <span style="color: maroon;">IConvert</span><br /><span style="color: blue;">FROM</span> <span style="color: magenta;"><i>Openquery</i></span><span style="color: maroon;">(</span><span style="color: maroon;">[mylinkedserver]</span><span style="color: silver;">,</span> <span style="color: red;">'select * from [MySchema].NUMBER_TEST'</span><span style="color: maroon;">)</span>
<br /> <span style="color: maroon;">Numbers</span> </span><br />
<br />
RESULTS:<br />
<br />
SSMS 2014:<br />
<table border="1">
<tbody>
<tr><td><b> undefined </b></td><td><b> IConvert </b></td></tr>
<tr><td>-98.786 </td><td>-98.79 </td></tr>
<tr><td>-98.785 </td><td><span style="background-color: yellow;">-98.78 </span></td></tr>
<tr><td>-98.784 </td><td>-98.78</td></tr>
<tr><td>98.784 </td><td>98.78 </td></tr>
<tr><td>98.785 </td><td><span style="background-color: yellow;">98.78 </span></td></tr>
<tr><td>98.786 </td><td>98.79 </td></tr>
</tbody></table>
<br />
SSMS 2008 R2:<br />
<table border="1">
<tbody>
<tr><td><b> undefined </b></td><td><b> IConvert </b></td></tr>
<tr><td>-98.786 </td><td>-98.79 </td></tr>
<tr><td>-98.785 </td><td><span style="background-color: yellow;">-98.79 </span></td></tr>
<tr><td>-98.784 </td><td>-98.78 </td></tr>
<tr><td>98.784 </td><td>98.78 </td></tr>
<tr><td>98.785 </td><td><span style="background-color: yellow;">98.79 </span></td></tr>
<tr><td>98.786 </td><td>98.79 </td></tr>
</tbody></table>
<br />
<div>
FINDINGS:</div>
<div>
<br /></div>
<div>
I'm not sure what's going on here with the implicit conversion from Oracle NUMBER to SQL Server numeric, particularly why the behavior differs between SQL Server 2014 and SQL Server 2008 R2. Did Microsoft make a decision to change the rounding behavior? I did test the generic ROUND() function, which returns consistent results in all three environments:</div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br />SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: magenta; font-family: "courier new"; font-size: x-small;"><i>Round</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">-</span><span style="font-family: "courier new"; font-size: x-small;">98.785</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">2</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span></div>
<div>
SQL Servers:</div>
<div>
<span style="font-family: "courier new"; font-size: x-small;">-98.790</span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
<div>
Oracle:</div>
<div>
<span style="font-family: "courier new"; font-size: x-small;">-98.79</span></div>
<div>
<br /></div>
<div>
Given multiple other issues in 2014 with Oracle NUMBER data types, there is reason to suspect that either there is a bug:</div>
<div>
<a href="https://support.microsoft.com/en-us/kb/3138659">FIX: Slow performance when you query numeric data types from an Oracle database</a></div>
<div>
<a href="https://support.microsoft.com/en-us/kb/3051993">FIX: The value of NUMBER type is truncated when you select data from an Oracle-linked server by using OLE DB provider</a></div>
<div>
<br /></div>
<div>
...or that, for whatever reason, the implicit conversion in 2014 is rounding halves to the nearest even digit (banker's rounding).</div>
<div>
<br /></div>
<div>
Perhaps the 2014 SQL Server is implicitly converting to <a href="https://msdn.microsoft.com/en-us/library/ms151817(v=sql.105).aspx">float</a>, using the nearest even prior to the explicit cast to Numeric. However, how the scale (number of decimal digits that will be stored to the right of the decimal point) would be determined in such a scenario is a conundrum. Either way, although <a href="https://msdn.microsoft.com/en-us/library/ms151817(v=sql.120).aspx">the mapping is defined the same</a>, the behavior demonstrated between the two versions of SQL Server is inconsistent.</div>
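<div>
<br /></div>
<div>
A quick local check of the float theory might look like the sketch below. It is only an approximation of the linked-server case: it assumes a T-SQL decimal literal behaves like the value arriving from Oracle, which may not hold, and the column aliases are made up for illustration.</div>
<div>
<span style="font-family: 'courier new'; font-size: x-small;">-- Decimal literal cast straight to numeric(20, 2)<br />
SELECT CAST(98.785 AS NUMERIC(20, 2)) AS DirectCast;<br />
-- Same literal forced through float before the cast<br />
SELECT CAST(CAST(98.785 AS FLOAT) AS NUMERIC(20, 2)) AS ViaFloat;</span></div>
<div>
Comparing the two columns on each server would show whether a float step alone changes the result. A difference would be consistent with the theory but would not prove it.</div>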
<div>
<br /></div>
<div>
My research into the ANSI and IEEE standards boils down to this: truncation and/or rounding is implementation-defined.</div>
<div>
<br /></div>
<div>
I ultimately used the following workaround based on Oracle's rounding behavior as defined by its <a href="https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions135.htm">ROUND()</a> Function:</div>
<div>
<br /></div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">*</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: magenta; font-family: "courier new"; font-size: x-small;"><i>Openquery</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[mylinkedserver]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'select UNDEFINED, ROUND(UNDEFINED,2) AS </span></div>
<div>
<span style="color: red; font-family: "courier new"; font-size: x-small;">IConvert from </span><span style="color: red; font-family: "courier new"; font-size: x-small;">[MySchema].</span><span style="color: red; font-family: "courier new"; font-size: x-small;">NUMBER_TEST order by 1'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">Numbers</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<div>
<br /></div>
<div>
I then had results in 2014 that match what is returned from 2008 R2.</div>
<div>
<br /></div>
<div>
The lesson here is to make sure any conversions, or rounding, happen as far upstream as possible.</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
POST:</div>
<div>
I was blogging as I went with this. Here's one last FLOAT test I ran:</div>
<div>
<br /></div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">TRUNCATE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">TABLE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[MySchema]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">number_test</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[MySchema]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">number_test</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">VALUES</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">-</span><span style="font-family: "courier new"; font-size: x-small;">54321.785</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[MySchema]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">number_test</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">VALUES</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">-</span><span style="font-family: "courier new"; font-size: x-small;">98.785</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[MySchema]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">number_test</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">VALUES</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;">98.785</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[MySchema]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">number_test</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">VALUES</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;">54321.785</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">COMMIT</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
<div>
SQL Server Queries:</div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><span style="color: blue;">SELECT</span> <span style="color: maroon;">undefined</span><span style="color: silver;">,</span><br /> <span style="color: magenta;"><i>Cast</i></span><span style="color: maroon;">(</span><span style="color: maroon;">undefined</span> <span style="color: blue;">AS</span> <span style="background-color: yellow;"><span style="color: black;"><i>NUMERIC</i></span><span style="color: maroon;">(</span><span style="color: black;">20</span><span style="color: silver;">,</span> <span style="color: black;">2</span><span style="color: maroon;">)</span></span><span style="color: maroon;">)</span> <span style="color: blue;">AS</span> <span style="color: maroon;">IConvert</span><br /><span style="color: blue;">FROM</span> <span style="color: magenta;"><i>Openquery</i></span><span style="color: maroon;">(</span></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[mylinkedserver]</span><span style="font-family: "courier new"; font-size: x-small;"><span style="color: silver;">,</span> <span style="color: red;">'select * from </span></span><span style="color: red; font-family: "courier new"; font-size: x-small;">[MySchema].</span><span style="font-family: "courier new"; font-size: x-small;"><span style="color: red;">NUMBER_TEST'</span><span style="color: maroon;">)</span> </span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><span style="color: maroon;">Numbers</span><br /><br /><span style="color: blue;">SELECT</span> <span style="color: maroon;">undefined</span><span style="color: silver;">,</span><br /> <span style="color: magenta;"><i>Cast</i></span><span style="color: maroon;">(</span><span style="color: maroon;">undefined</span> <span style="color: blue;">AS</span> <span style="background-color: yellow;"><span style="color: black;"><i>FLOAT</i></span><span style="color: maroon;">(</span><span style="color: black;">1</span><span style="color: maroon;">)</span></span><span style="color: maroon;">)</span> <span style="color: blue;">AS</span> <span style="color: maroon;">IConvert</span><br /><span style="color: blue;">FROM</span> <span style="color: magenta;"><i>Openquery</i></span><span style="color: maroon;">(</span></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[mylinkedserver]</span><span style="font-family: "courier new"; font-size: x-small;"><span style="color: silver;">,</span> <span style="color: red;">'select * from </span></span><span style="color: red; font-family: "courier new"; font-size: x-small;">[MySchema].</span><span style="font-family: "courier new"; font-size: x-small;"><span style="color: red;">NUMBER_TEST'</span><span style="color: maroon;">)</span> </span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><span style="color: maroon;">Numbers</span> </span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
<div>
Results:</div>
<div>
SQL Server 2014</div>
<table border="1">
<tbody>
<tr><td><b> undefined </b></td><td><b> IConvert </b></td></tr>
<tr><td>-54321.785 </td><td><span style="background-color: lime;">-54321.79 </span></td></tr>
<tr><td>-98.785 </td><td><span style="background-color: yellow;">-98.78 </span></td></tr>
<tr><td>98.785 </td><td><span style="background-color: yellow;">98.78 </span></td></tr>
<tr><td>54321.785 </td><td><span style="background-color: lime;">54321.79 </span></td></tr>
</tbody></table>
With the FLOAT(1) cast, which per the <a href="https://msdn.microsoft.com/en-us/library/ms173773.aspx">float definition</a> SQL Server stores as a 4-byte float with 7 digits of precision, we see a different rounding behavior: only the values with more than 7 significant digits are rounded.<br />
<br />
<table border="1">
<tbody>
<tr><td><b> undefined </b></td><td><b> IConvert </b></td></tr>
<tr><td>-54321.785 </td><td>-54321.79 </td></tr>
<tr><td>-98.785 </td><td>-98.785 </td></tr>
<tr><td>98.785 </td><td>98.785 </td></tr>
<tr><td>54321.785 </td><td>54321.79 </td></tr>
</tbody></table>
<br />
SQL Server 2008 R2
<br />
<table border="1">
<tbody>
<tr><td><b> undefined </b></td><td><b> IConvert </b></td></tr>
<tr><td>-54321.785 </td><td>-54321.79 </td></tr>
<tr><td>-98.785 </td><td><span style="background-color: yellow;">-98.79 </span></td></tr>
<tr><td>98.785 </td><td><span style="background-color: yellow;">98.79 </span></td></tr>
<tr><td>54321.785 </td><td>54321.79 </td></tr>
</tbody></table>
<br />
<table border="1">
<tbody>
<tr><td><b> undefined </b></td><td><b> IConvert </b></td></tr>
<tr><td>-54321.785 </td><td>-54321.79 </td></tr>
<tr><td>-98.785 </td><td>-98.785 </td></tr>
<tr><td>98.785 </td><td>98.785 </td></tr>
<tr><td>54321.785 </td><td>54321.79 </td></tr>
</tbody></table>
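<br />
For anyone who wants to experiment without a linked server, the two casts can be tried against a local table variable. This is only a sketch: @number_test and its column n are names made up for the example, and it assumes the provider hands the values to SQL Server as FLOAT, which the queries above do not confirm. No output is shown here because, as the tables above suggest, the NUMERIC result can differ between SQL Server versions.<br />
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Hypothetical local stand-in for the linked server's NUMBER_TEST table
DECLARE @number_test TABLE (n FLOAT);

INSERT INTO @number_test (n)
VALUES (-54321.785), (-98.785), (98.785), (54321.785);

SELECT n,
       CAST(n AS NUMERIC(20, 2)) AS numeric_convert, -- fixed scale of 2
       CAST(n AS FLOAT(1))       AS float_convert    -- FLOAT(1) is stored as a 4-byte REAL
FROM   @number_test;
</pre>
Running the same batch on each version is a quick way to check whether an instance rounds the 98.785 rows up or down.<br />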
<br />
<b>sp_help returns the wrong length, or does it?</b> (Jon Morisi, 2016-07-01)<br />
<br />
This is just a quick blog post about a basic sp_help / data type storage gotcha. <br />
<br />
I was working with a contractor today who was having difficulty getting me the details of a table definition. I was specifically interested in a particular column's data type and size. (This was related to an ETL process I was working on, and my desire to avoid any implicit conversions.)<br />
<br />
The reply I got back was, "the column you're interested in is an nvarchar(100)". After continued digging and troubleshooting, I was eventually able to sort out that it was actually an nvarchar(50).<br />
<br />
I put together this TEST table to illustrate where the confusion came from. Can you spot what's going on?<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOu0UrgA45kL9bSTS8lYV8dJJ8IN3dI5vUPZtUqdWxFouNYOypzAh6FI-ZLMSSY6LBV4roY6AHSSwKD4mxPEsNLyTnYM6wwd07bFY2eiaOJBNbYJa0Z4d2lFs3qryWp3te6hhQBdvVEaE/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="445" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOu0UrgA45kL9bSTS8lYV8dJJ8IN3dI5vUPZtUqdWxFouNYOypzAh6FI-ZLMSSY6LBV4roY6AHSSwKD4mxPEsNLyTnYM6wwd07bFY2eiaOJBNbYJa0Z4d2lFs3qryWp3te6hhQBdvVEaE/s640/Capture.PNG" width="640" /></a></div>
<br />
<br />
INFORMATION_SCHEMA.COLUMNS returns nvarchar(50), but sp_help returns nvarchar(100)! Surely there's a bug in SQL Server!!!<br />
<br />
Uh, no.<br />
<br />
The Length column in sp_help reports the "physical length of the data type (in bytes)."<br />
<br />
In simplistic terms:<br />
char(n) - The storage size is n bytes.<br />
nchar(n) - The storage size is two times n bytes.<br />
The same goes for varchar and nvarchar, so an nvarchar(50) column holds up to 50 characters but shows a Length of 100 in sp_help.<br />
<br />
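The difference is easy to see side by side. Here is a minimal sketch, assuming a throwaway table named dbo.length_demo (a made-up name, not the TEST table from the screenshot). sys.columns.max_length is reported in bytes, like the Length column in sp_help, while COLUMNPROPERTY with 'CharMaxLen' is reported in characters, like INFORMATION_SCHEMA.COLUMNS.<br />
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Hypothetical demo table: one column per character type, all declared with n = 50
CREATE TABLE dbo.length_demo
  (
     c_char     CHAR(50),
     c_nchar    NCHAR(50),
     c_varchar  VARCHAR(50),
     c_nvarchar NVARCHAR(50)
  );

SELECT c.name                                            AS column_name,
       TYPE_NAME(c.user_type_id)                         AS data_type,
       c.max_length                                      AS length_in_bytes,
       COLUMNPROPERTY(c.object_id, c.name, 'CharMaxLen') AS length_in_characters
FROM   sys.columns AS c
WHERE  c.object_id = OBJECT_ID('dbo.length_demo');

-- Cleanup
DROP TABLE dbo.length_demo;
</pre>
The nchar and nvarchar columns should come back as 100 bytes and 50 characters; the char and varchar columns as 50 and 50.<br />
<br />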
Details and References are here:<br />
<a href="https://msdn.microsoft.com/en-us/library/ms187335.aspx?f=255&MSPPError=-2147217396">sp_help</a><br />
<a href="https://msdn.microsoft.com/en-us/library/ms176089.aspx?f=255&MSPPError=-2147217396">char and varchar</a><br />
<a href="https://msdn.microsoft.com/en-us/library/ms186939.aspx">nchar and nvarchar</a><br />
<a href="https://msdn.microsoft.com/en-us/library/ms187752.aspx">Data Types</a><br />
<br />
<b>Full-Text Search with Page Numbers</b> (Jon Morisi, 2016-06-23)<br />
<br />
In my last blog post, <a href="http://jonmorisissqlblog.blogspot.com/2016/06/setting-up-full-text-search-for-pdf.html">Setting up Full-Text Search for PDF files</a>, I detailed how to get things set up. If you tried this, you may have noticed that although the searches worked, what you got back was a file name. This isn't so helpful if your document is an all-encompassing 538 pages. So, how do we get a page number back? The best I've come up with so far is to split the 538 pages into 538 documents and load / search on those.<br />
<br />
My first Google search on how to split a PDF into pages came back with <a href="http://www.splitpdf.com/">http://www.splitpdf.com/</a>, so I went ahead and used that. I'm sure there is a way to do this through Acrobat, or even to roll your own split functionality via the <a href="http://www.adobe.com/devnet/acrobat.html">API</a>.<br />
<br />
The website was simple enough: drag and drop a file, check the "Extract all pages into separate files" box, click "Split!", download the zip, and extract.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikdO3SCH4BBKkkxZlRhH6kLjbaSECE9KhPdcgrptlSMJ8A6axohlvgBK7rjgcznfXLAmMy-wtV1hA4dHtbhoK9t-SB0LBnLlhxAc23uNoSLRMOJigUJWd5L4gL_MHXlOluymB6fm1RAbA/s1600/Capture01.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="297" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikdO3SCH4BBKkkxZlRhH6kLjbaSECE9KhPdcgrptlSMJ8A6axohlvgBK7rjgcznfXLAmMy-wtV1hA4dHtbhoK9t-SB0LBnLlhxAc23uNoSLRMOJigUJWd5L4gL_MHXlOluymB6fm1RAbA/s640/Capture01.PNG" width="640" /></a></div>
<br />
Now we have to load 538 pages into our table. There are multiple ways to do this; it boils down to a directory listing and string parsing to generate the inserts. Here's how I did it:<br />
<br />
<i style="color: green; font-family: "Courier New"; font-size: small;">--enable xp_cmdshell</i><br />
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>Sp_configure</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'show advanced options'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">RECONFIGURE</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><br />
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b><br /></b></span>
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>Sp_configure</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'xp_cmdshell'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">RECONFIGURE</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SET</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">nocount</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">ON</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">CREATE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">TABLE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">#inputfiles</span><span style="font-family: "courier new"; font-size: x-small;"></span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">seqid</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>[NUMERIC]</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;">3</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">0</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">IDENTITY</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">fn</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>NVARCHAR</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">max</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">;</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">#inputfiles</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--displays full path</i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">EXEC</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>xp_cmdshell</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'dir "D:\Files\User Guide" /s/b'</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i><br /></i></span>
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--xp_cmdshell ends its output with a NULL row; remove it</i></span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">DELETE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">#inputfiles</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">fn</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">IS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NULL</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--Loop the temp table performing the inserts</i></span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">DECLARE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Counter</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>INT</i></span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">DECLARE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Max</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>INT</i></span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">DECLARE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@SQL</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>NVARCHAR</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">max</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">DECLARE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@FileName</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>NVARCHAR</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">max</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SET</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Counter</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">;</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SET</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Max</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">= </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: magenta; font-family: "courier new"; font-size: x-small;"><i>Max</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">seqid</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">#inputfiles</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">;</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHILE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Counter</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;"><=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Max</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">BEGIN</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;">    <span style="color: blue;">SET</span> <span style="color: #8000ff;">@FileName</span> <span style="color: silver;">=</span> <span style="color: maroon;">(</span><span style="color: blue;">SELECT</span> <span style="color: maroon;">#inputfiles</span><span style="color: silver;">.</span><span style="color: maroon;">fn</span> <span style="color: blue;">FROM</span> <span style="color: maroon;">#inputfiles</span> <span style="color: blue;">WHERE</span> <span style="color: maroon;">seqid</span> <span style="color: silver;">=</span> <span style="color: #8000ff;">@Counter</span><span style="color: maroon;">)</span></span><br />
<span style="font-family: "courier new"; font-size: x-small;">    <span style="color: blue;">SET</span> <span style="color: #8000ff;">@SQL</span> <span style="color: silver;">=</span> <span style="color: red;">'INSERT INTO [FULLTEXTDEMO].[dbo].[files] ([filename], [filetype], [file]) SELECT '</span><span style="color: silver;">+</span><span style="color: magenta;"><i>Char</i></span><span style="color: maroon;">(</span>39<span style="color: maroon;">)</span><span style="color: silver;">+</span><span style="color: #8000ff;">@FileName</span><span style="color: silver;">+</span><span style="color: magenta;"><i>Char</i></span><span style="color: maroon;">(</span>39<span style="color: maroon;">)</span><span style="color: silver;">+</span><span style="color: red;">','</span><span style="color: silver;">+</span><span style="color: magenta;"><i>Char</i></span><span style="color: maroon;">(</span>39<span style="color: maroon;">)</span><span style="color: silver;">+</span><span style="color: red;">'.pdf'</span><span style="color: silver;">+</span><span style="color: magenta;"><i>Char</i></span><span style="color: maroon;">(</span>39<span style="color: maroon;">)</span><span style="color: silver;">+</span><span style="color: red;">',bulkcolumn FROM OPENROWSET(BULK '</span><span style="color: silver;">+</span><span style="color: magenta;"><i>Char</i></span><span style="color: maroon;">(</span>39<span style="color: maroon;">)</span><span style="color: silver;">+</span><span style="color: #8000ff;">@FileName</span><span style="color: silver;">+</span><span style="color: magenta;"><i>Char</i></span><span style="color: maroon;">(</span>39<span style="color: maroon;">)</span><span style="color: silver;">+</span><span style="color: red;">', single_blob) AS input'</span></span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">EXECUTE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>sp_executesql</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@SQL</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">SET</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Counter</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@Counter</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">+</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">END</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<br style="font-family: "Courier New"; font-size: small;" />
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--Cleanup</i></span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">DROP</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">TABLE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">#inputfiles</span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SET</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">nocount</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">OFF</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><br />
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b><br /></b></span>
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>sp_configure</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'xp_cmdshell'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">0</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">RECONFIGURE</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><br />
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b><br /></b></span>
<span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>sp_configure</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'show advanced options'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">0</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">RECONFIGURE</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;"><br /></span>
You can verify the population by querying the <a href="https://msdn.microsoft.com/en-us/library/ms190370.aspx">FULLTEXTCATALOGPROPERTY PopulateStatus</a>:<br />
<br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: magenta; font-family: "courier new"; font-size: x-small;"><i>Fulltextcatalogproperty</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">c</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NAME</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'PopulateStatus'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">sys</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">fulltext_catalogs</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">c</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
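<div>
The query above returns a numeric status code per catalog. As a rough sketch (the column aliases are my own), the documented PopulateStatus codes can be decoded inline so the result is easier to read:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Decode the PopulateStatus code for every full-text catalog in the current database
SELECT c.name AS catalog_name,
       CASE FULLTEXTCATALOGPROPERTY(c.name, 'PopulateStatus')
            WHEN 0 THEN 'Idle'
            WHEN 1 THEN 'Full population in progress'
            WHEN 2 THEN 'Paused'
            WHEN 3 THEN 'Throttled'
            WHEN 4 THEN 'Recovering'
            WHEN 5 THEN 'Shutdown'
            WHEN 6 THEN 'Incremental population in progress'
            WHEN 7 THEN 'Building index'
            WHEN 8 THEN 'Disk is full. Paused'
            WHEN 9 THEN 'Change tracking'
            ELSE 'Unknown'
       END AS populate_status
FROM sys.fulltext_catalogs AS c;
</pre>
<div>
A status of Idle after the load generally means the population has finished and the files are ready to search.</div>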
Here's a search I ran on the example data:<br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'file://'</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">+</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">AS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[File]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>CONTAINS</b></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'NEAR(("Admin*","Client"),2)'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FOR</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">xml</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">raw</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
Click<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLYn2H8xYN6VLY66BpXBfnMB6LJVMMPfV6DTxfEBss9uP0RS1f-ZGz21sDRm53hXdqsq4FBwANkvOZwLjkyro1fnIypDqMjVQaiRnDrNKy5tabzlhVjwj6q4l_K-psdltJfKtMlY4LewI/s1600/Capture02.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="40" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLYn2H8xYN6VLY66BpXBfnMB6LJVMMPfV6DTxfEBss9uP0RS1f-ZGz21sDRm53hXdqsq4FBwANkvOZwLjkyro1fnIypDqMjVQaiRnDrNKy5tabzlhVjwj6q4l_K-psdltJfKtMlY4LewI/s640/Capture02.PNG" width="640" /></a></div>
<br />
Click<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfeGa50FomOJaZlzx2QXS5LLD81sVQ95j92PT22TWmg-C-_b5NKmFfvC1Yi-VgWhV4l7sjep6iUI6eKvptbvlEzu8bBf-mFNTWrG3euh2fFOh1YRb7XEVb9z9rsFRQeJ4GjZ8jKYVPSRo/s1600/Capture03.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfeGa50FomOJaZlzx2QXS5LLD81sVQ95j92PT22TWmg-C-_b5NKmFfvC1Yi-VgWhV4l7sjep6iUI6eKvptbvlEzu8bBf-mFNTWrG3euh2fFOh1YRb7XEVb9z9rsFRQeJ4GjZ8jKYVPSRo/s1600/Capture03.PNG" /></a></div>
Boom<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6752JdnOu_up2lEFkkzUrNGof_DqL0ln3DJqjBcMAYenuV1kGFCp5u2ohLOXYfKNdAMBLAQ3wEBamFvUhOwhcS-OYYpvAfZSyUco_iWF8HwkQoLUgWJ6UGjy8qmJZe7m9e83ZyMpgHBQ/s1600/Capture04.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="94" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6752JdnOu_up2lEFkkzUrNGof_DqL0ln3DJqjBcMAYenuV1kGFCp5u2ohLOXYfKNdAMBLAQ3wEBamFvUhOwhcS-OYYpvAfZSyUco_iWF8HwkQoLUgWJ6UGjy8qmJZe7m9e83ZyMpgHBQ/s640/Capture04.PNG" width="640" /></a></div>
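<div>
For reference, the NEAR term in the search above takes a list of terms followed by a maximum distance, which is the most non-search words allowed between the first and last term. A minimal variation, assuming the same table and a SQL Server version that supports the custom proximity syntax (2012 or later), adds the optional third argument so that the terms must appear in the order listed:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Same search, but "Admin*" must come before "Client" (TRUE = match in the specified order)
SELECT [filename]
FROM   [FULLTEXTDEMO].[dbo].[files]
WHERE  CONTAINS([file], 'NEAR(("Admin*","Client"), 2, TRUE)');
</pre>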
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-89892916275405180512016-06-01T09:35:00.000-07:002016-06-01T09:35:40.258-07:00Setting up Full-Text Search for PDF filesHas this ever happened to you? You're assigned to work on a new project with a new vendor. The vendor has a website that's very difficult to browse for documentation. Some of it's under one domain name, some of it's under another domain name. Some of the relevant documentation for the version you're on is actually published under a prior version or edition. There's plenty of documentation, it's scattered about, and it's all...PDF files.<br />
<br />
Faced with this very issue, I decided to set up a local <a href="https://msdn.microsoft.com/en-us/library/ms142571.aspx">SQL Server Full-Text Search</a>.<br />
Some of the cool things Full-Text Search lets you search for, over and above a standard search, include the following:<br />
<ul>
<li>One or more specific words or phrases (simple term)</li>
<li>A word or a phrase where the words begin with specified text (prefix term)</li>
<li>Inflectional forms of a specific word (generation term)</li>
<li>A word or phrase close to another word or phrase (proximity term)</li>
<li>Synonymous forms of a specific word (thesaurus)</li>
<li>Words or phrases using weighted values (weighted term)</li>
</ul>
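<div>
To make the list above a little more concrete, here is a sketch of what each kind of term looks like in a query. It assumes the [FULLTEXTDEMO].[dbo].[files] table and its full-text indexed [file] column that get set up later in this post. The search words are placeholders I made up, and the last query assumes [id] is the full-text key column:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Simple term: an exact word or phrase
SELECT [filename] FROM [FULLTEXTDEMO].[dbo].[files]
WHERE  CONTAINS([file], '"transaction log"');

-- Prefix term: words beginning with "admin"
SELECT [filename] FROM [FULLTEXTDEMO].[dbo].[files]
WHERE  CONTAINS([file], '"admin*"');

-- Generation term: inflectional forms (install, installs, installed, ...)
SELECT [filename] FROM [FULLTEXTDEMO].[dbo].[files]
WHERE  CONTAINS([file], 'FORMSOF(INFLECTIONAL, install)');

-- Proximity term: "backup" within 5 words of "restore"
SELECT [filename] FROM [FULLTEXTDEMO].[dbo].[files]
WHERE  CONTAINS([file], 'NEAR((backup, restore), 5)');

-- Thesaurus: synonyms, as defined in the thesaurus file for the language
SELECT [filename] FROM [FULLTEXTDEMO].[dbo].[files]
WHERE  CONTAINS([file], 'FORMSOF(THESAURUS, error)');

-- Weighted term: weights only influence ranking, so use CONTAINSTABLE to see the rank
SELECT f.[filename], k.[RANK]
FROM   [FULLTEXTDEMO].[dbo].[files] AS f
       INNER JOIN CONTAINSTABLE([FULLTEXTDEMO].[dbo].[files], [file],
                  'ISABOUT(backup WEIGHT(0.8), restore WEIGHT(0.2))') AS k
               ON f.[id] = k.[KEY]
ORDER  BY k.[RANK] DESC;
</pre>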
<div>
To get started with the setup, it's important to know that the Full-Text Search architecture relies on filters to read the various file types. This matters for this example because the PDF filter is not installed by default. So, for starters, we need to download and install the <a href="http://www.adobe.com/support/downloads/detail.jsp?ftpID=5542">PDF iFilter</a> (PDFFilter64Setup.msi).</div>
<div>
<br /></div>
<div>
This is a next, next, finish install:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0pC7GCA_0PE-WLdGf-nhaY6HPv4amFDSxzhARtjHbPV36fbx7TfnYGBdLy5X8kUii4rbJvvCzBehaPsYzz7PRSgcdO4pPoZp7TdP2JKvuDcDngGr-CWLwpmKt314ZVjFiHSqgBiLCnN0/s1600/Capture01.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0pC7GCA_0PE-WLdGf-nhaY6HPv4amFDSxzhARtjHbPV36fbx7TfnYGBdLy5X8kUii4rbJvvCzBehaPsYzz7PRSgcdO4pPoZp7TdP2JKvuDcDngGr-CWLwpmKt314ZVjFiHSqgBiLCnN0/s1600/Capture01.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-AWGPsaJcBCzrqsfTSmKzPZHn51k-ypwk_a9SdcuSCPKTU3KTtuWoub5gscZHuCO8_iK9eg20zb6OGSIl0GiFJEUQMnIyayZ4cUfOlv-iGb3_ujXBkDuYPRF8V1ZrxrgOGJQ8Iq144TU/s1600/Capture02.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-AWGPsaJcBCzrqsfTSmKzPZHn51k-ypwk_a9SdcuSCPKTU3KTtuWoub5gscZHuCO8_iK9eg20zb6OGSIl0GiFJEUQMnIyayZ4cUfOlv-iGb3_ujXBkDuYPRF8V1ZrxrgOGJQ8Iq144TU/s1600/Capture02.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilZ8P47BAvp7xNt4ugxOcLcrFtsAjKTZDI-A1kVZUKhInvdjWUeXLqvYyMETGtR7i-Z8EaVs4ZJOqtxPG-dGDm2MKAKY1xpJRLAI_yf_oXyv7ILualwsU4pq05h_iQZ5MtTJt4gmyhijo/s1600/Capture03.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilZ8P47BAvp7xNt4ugxOcLcrFtsAjKTZDI-A1kVZUKhInvdjWUeXLqvYyMETGtR7i-Z8EaVs4ZJOqtxPG-dGDm2MKAKY1xpJRLAI_yf_oXyv7ILualwsU4pq05h_iQZ5MtTJt4gmyhijo/s1600/Capture03.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS25jmeUA5GswW1bKjS3n1jM6ttinMavb8pVS6VbkOKf1fv8nnH_z352p3tx3H_1P9I_HracxGEgnfCZRDntJyVmM9Q2ihikFDkF-M4t2mEB6joBiUKlrwLVyafEqZoeW2Y42tKVGibw8/s1600/Capture04.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS25jmeUA5GswW1bKjS3n1jM6ttinMavb8pVS6VbkOKf1fv8nnH_z352p3tx3H_1P9I_HracxGEgnfCZRDntJyVmM9Q2ihikFDkF-M4t2mEB6joBiUKlrwLVyafEqZoeW2Y42tKVGibw8/s1600/Capture04.PNG" /></a></div>
</div>
<div>
There is a trick to getting the installation set up correctly: you have to set up a system variable. Here's how I did it.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVj_lYSZnulhGU-T6kZC2KTvECodV1gLZu3eJ937xict56oJAWPHnqBBoa0jeFfVTqVq-lEcfL8U1mQ3Otx39JbN4gL5wg73SQcANz8TcPkbNHbchRYZ92eEhArvR2f8JFbOEZCzmJbOk/s1600/Capture07.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVj_lYSZnulhGU-T6kZC2KTvECodV1gLZu3eJ937xict56oJAWPHnqBBoa0jeFfVTqVq-lEcfL8U1mQ3Otx39JbN4gL5wg73SQcANz8TcPkbNHbchRYZ92eEhArvR2f8JFbOEZCzmJbOk/s1600/Capture07.PNG" /></a></div>
<div>
<br /></div>
<div>
Environment Variables:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0RUU9CKpUyot3iu5lStkjQamBH69p4e8Uj2tQ0KbGE6q-VIWaKC6U9z2HgRfbScz0QPdCK2JygBGBlSeKeVI07IKGYWSQKDz73VRAQlq6vaB42JdKERIMsH58N3-6Ff8SKWiEInfWCng/s1600/Capture08.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0RUU9CKpUyot3iu5lStkjQamBH69p4e8Uj2tQ0KbGE6q-VIWaKC6U9z2HgRfbScz0QPdCK2JygBGBlSeKeVI07IKGYWSQKDz73VRAQlq6vaB42JdKERIMsH58N3-6Ff8SKWiEInfWCng/s1600/Capture08.PNG" /></a></div>
<div>
<br /></div>
<div>
From Environment Variables I first created a new Variable "AdobeiFilter" and pointed it to the bin directory associated with the install:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhM5JPB55rXLKFcOSo5Us-ha1JOsQL5bk1feDFvIoghBSYWqq-7IKu9wXCh1olFNL4oEEYXLz7jsm3rr5SLSSezTEPms5w97ZiKA8VHUbcE0jKT9HXGjWRba7nbJKLEJzZsb3-xHv5o9ww/s1600/Capture09.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhM5JPB55rXLKFcOSo5Us-ha1JOsQL5bk1feDFvIoghBSYWqq-7IKu9wXCh1olFNL4oEEYXLz7jsm3rr5SLSSezTEPms5w97ZiKA8VHUbcE0jKT9HXGjWRba7nbJKLEJzZsb3-xHv5o9ww/s1600/Capture09.PNG" /></a></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJkxTZFIGdbEFQyJvlQrXOlpZq4cBl9dvAetrbnlHW0RzeD7ahOq9kPEsBjIfkJY3wf-lKpC-r0ZhBVc90l_NwK2MdmAlvshKg7QoYXXiPQSTjqzOF1EqxxFMWzNwlnCGhJIDRN8PdZp4/s1600/Capture10.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJkxTZFIGdbEFQyJvlQrXOlpZq4cBl9dvAetrbnlHW0RzeD7ahOq9kPEsBjIfkJY3wf-lKpC-r0ZhBVc90l_NwK2MdmAlvshKg7QoYXXiPQSTjqzOF1EqxxFMWzNwlnCGhJIDRN8PdZp4/s1600/Capture10.PNG" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
I then clicked Edit on the Path variable, in Environment Variables, and appended %AdobeiFilter% to the end:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWAf2RusG0-DC__vmMvPpHGVw_XzChEtYm3tfwb3DNjFo5tbx7iMmlniheVes7X_8zzM36rNNuoXvhavKeJ7GFd0uUnW5Sb9smvrtqJmGrCm_qP4UYbuKsm8dF4D0zbA1YtXST1avglHo/s1600/Capture11.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWAf2RusG0-DC__vmMvPpHGVw_XzChEtYm3tfwb3DNjFo5tbx7iMmlniheVes7X_8zzM36rNNuoXvhavKeJ7GFd0uUnW5Sb9smvrtqJmGrCm_qP4UYbuKsm8dF4D0zbA1YtXST1avglHo/s1600/Capture11.PNG" /></a></div>
<div>
<br /></div>
<div>
I can validate this by opening Run and typing "%AdobeiFilter%", which drops me into my bin directory here: "C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin"</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnA_sz21ohM_GQlYLpGTOfoOU2XTDd8B_pXSnOqbKPgA_JbjtQETx0hbtaDBNDRBKAb2o6lq-_sRolZRGZG_ijPwmZrqv20ZsA7yzB7JqrVB0Nb_JlBIoexEyeT-oiGGbZSWSgkXu1WEs/s1600/Capture12.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnA_sz21ohM_GQlYLpGTOfoOU2XTDd8B_pXSnOqbKPgA_JbjtQETx0hbtaDBNDRBKAb2o6lq-_sRolZRGZG_ijPwmZrqv20ZsA7yzB7JqrVB0Nb_JlBIoexEyeT-oiGGbZSWSgkXu1WEs/s1600/Capture12.PNG" /></a></div>
<div>
<br /></div>
<div>
<i><span style="font-size: x-small;">You may also need to grant your SQL Server Service account access to this folder.</span></i></div>
<div>
<br /></div>
<div>
Next we'll configure SQL Server Full-Text. You can check to see if it's already installed by running the following:</div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: magenta; font-family: "courier new"; font-size: x-small;"><i>Serverproperty</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: red; font-family: "courier new"; font-size: x-small;">'IsFullTextInstalled'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
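<div>
The property comes back as 1, 0, or NULL. A small sketch (the alias and messages are my own wording) that turns the result into something readable:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- 1 = installed, 0 = not installed, NULL = invalid input or an error
SELECT CASE CAST(SERVERPROPERTY('IsFullTextInstalled') AS INT)
            WHEN 1 THEN 'Full-Text Search is installed'
            WHEN 0 THEN 'Full-Text Search is NOT installed'
            ELSE 'Could not determine (NULL returned)'
       END AS fulltext_install_status;
</pre>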
<div>
If it's not installed, you can run through the standard SQL Server installation, choosing "New SQL Server stand-alone installation or add features to an existing installation":</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIylCxkG2cibvKSYAOFPgPj7WHlbjnHEiUaoquaTSizexfTji6Oh8-U5lgGX2yRKo4mcKCHjUnVnjRNemaBw1WvBQcIBJj1LzhJiU1qijxna12wqFN8xB69TnNqDedqQlluNPqkBsXXFU/s1600/Capture05.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="476" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIylCxkG2cibvKSYAOFPgPj7WHlbjnHEiUaoquaTSizexfTji6Oh8-U5lgGX2yRKo4mcKCHjUnVnjRNemaBw1WvBQcIBJj1LzhJiU1qijxna12wqFN8xB69TnNqDedqQlluNPqkBsXXFU/s640/Capture05.PNG" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
The feature is listed as "Full-Text and Semantic Extractions for search":</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGksoekBcqexM3bZRc-7jdLArCTleVQbsnArKRFoA75JrA1A7dPaZ-3TKwHNc3P_dtkuoLFKXZ_jyYs6eN_4IqnaJ2oGMf-Rzx4GftlbuBsdZ8a369qHesmSvLxHdMGgogDh4JI0bpVPc/s1600/Capture06.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGksoekBcqexM3bZRc-7jdLArCTleVQbsnArKRFoA75JrA1A7dPaZ-3TKwHNc3P_dtkuoLFKXZ_jyYs6eN_4IqnaJ2oGMf-Rzx4GftlbuBsdZ8a369qHesmSvLxHdMGgogDh4JI0bpVPc/s640/Capture06.PNG" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Next we run a couple of <span style="color: #8000ff; font-family: "courier new"; font-size: x-small;"><a href="https://msdn.microsoft.com/en-us/library/ms175058.aspx">sp_fulltext_service</a></span><span style="font-family: "courier new"; font-size: x-small;"> </span>commands:</div>
<div class="separator" style="clear: both; text-align: left;">
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--Load operating system filters and word breakers</i></span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">EXEC</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>sp_fulltext_service</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@action</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="color: red; font-family: "courier new"; font-size: x-small;">'load_os_resources'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@value</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">;</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<br style="font-family: 'Courier New'; font-size: small;" />
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--Do not verify whether binaries are signed</i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">EXEC</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>sp_fulltext_service</b></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@action</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="color: red; font-family: "courier new"; font-size: x-small;">'verify_signature'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #8000ff; font-family: "courier new"; font-size: x-small;">@value</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;">0</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">;</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
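<div>
If you want to confirm that both settings took effect, one option is to read them back with FULLTEXTSERVICEPROPERTY. This is just a quick sanity check with aliases of my own choosing. Given the two commands above, I'd expect 1 for the first column and 0 for the second:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Read back the service-level settings changed by sp_fulltext_service
SELECT FULLTEXTSERVICEPROPERTY('LoadOSResources') AS load_os_resources,
       FULLTEXTSERVICEPROPERTY('VerifySignature') AS verify_signature;
</pre>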
<div class="separator" style="clear: both; text-align: left;">
Validate that SQL Server is associating Full-Text Search with the PDF filter:</div>
<div class="separator" style="clear: both; text-align: left;">
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">document_type</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">path</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><a href="https://msdn.microsoft.com/en-us/library/ms174373.aspx"><span style="color: maroon; font-family: "courier new"; font-size: x-small;">sys</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">fulltext_document_types</span></a><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">document_type</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'.pdf'</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<div>
<br /></div>
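<div>
If a row does come back, it can be worth checking that it really is the Adobe filter and not some other PDF filter already registered on the box. A slightly wider version of the same query, using the other columns of the catalog view, might look like this:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Show which DLL, version, and vendor are registered for .pdf documents
SELECT document_type,
       path,
       version,
       manufacturer
FROM   sys.fulltext_document_types
WHERE  document_type = '.pdf';
</pre>
<div>
On my setup I would expect the path to point into the bin directory from the environment variable step above.</div>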
<div>
<i><span style="font-size: x-small;">You may need to restart the SQL Service, possibly the box, at this point if the association isn't there.</span></i></div>
<div>
<i><span style="font-size: x-small;"><br /></span></i></div>
<div>
For this blog I created a new database, "FULLTEXTDEMO". By default full-text indexing was enabled at the time of creation. This can also be checked with the following query:</div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NAME</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">is_fulltext_enabled</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">sys</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">databases</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">ORDER</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">BY</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NAME</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<div>
<br /></div>
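<div>
To check just the one database rather than scanning the whole list, DATABASEPROPERTYEX exposes the same flag. Note that, as I understand the documentation, on SQL Server 2008 and later this flag no longer has any effect because user databases are always enabled for full-text search, so treat it as informational:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Check the full-text flag for a single database
SELECT DATABASEPROPERTYEX('FULLTEXTDEMO', 'IsFulltextEnabled') AS is_fulltext_enabled;
</pre>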
<div>
Next I created a table to hold my PDF files for full-text indexing:</div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">USE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<br style="font-family: 'Courier New'; font-size: small;" />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">CREATE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">TABLE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[id]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>[INT]</i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">IDENTITY</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;">1</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NOT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NULL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>[NVARCHAR]</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">max</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NULL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filetype]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>VARCHAR</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;">5</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NULL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--A type column is required to full-text index VARBINARY(MAX) data; it tells SQL Server which filter to use</i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="font-family: "courier new"; font-size: x-small;"><i>[VARBINARY]</i></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">max</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">NULL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">CONSTRAINT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[PK_files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">PRIMARY</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">KEY</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">CLUSTERED</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[id]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">ASC</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="color: blue; font-family: "courier new"; font-size: x-small;">WITH</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">pad_index</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">OFF</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">statistics_norecompute</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">OFF</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">ignore_dup_key</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">OFF</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">allow_row_locks</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">on</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">allow_page_locks</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">on</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">ON</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[PRIMARY]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">ON</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[PRIMARY]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">textimage_on</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[PRIMARY]</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
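<div>
<br /></div>
<div>
One caveat with this design: SQL Server can only index the contents of the <span style="font-family: 'courier new'; font-size: x-small;">[file]</span> column if a filter (IFilter) is registered for the extension stored in <span style="font-family: 'courier new'; font-size: x-small;">[filetype]</span>, and a PDF filter may not be present unless one has been installed. As a quick sketch for checking, the query below lists any filter registered for .pdf on the instance; an empty result suggests a PDF IFilter still needs to be installed before the PDFs can be indexed:</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Lists the full-text filter registered for the .pdf extension, if any
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';
</pre>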
<div>
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
<div>
Next, I created a full-text catalog:</div>
<div>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">USE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">CREATE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">fulltext</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">catalog</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FILESCATALOG]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: maroon; font-family: "courier new"; font-size: x-small;">go</span><span style="font-family: "courier new"; font-size: x-small;"> </span></div>
<div>
<span style="font-family: "courier new"; font-size: x-small;"><br /></span></div>
<div>
...and a full-text index to go with it:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-VI7TVxQHS_ZKKakKiPWLZumkXuW7fRoNENrVd56owFnpWwMvzGt2La7d48bbZzedJF6IFVyEIG5pMUbtrM8YSXWPh4gmpjiKQwxk0fEZ_crMdpbSvDJt783ozm4k_aY5qethzJU70yQ/s1600/Capture19.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-VI7TVxQHS_ZKKakKiPWLZumkXuW7fRoNENrVd56owFnpWwMvzGt2La7d48bbZzedJF6IFVyEIG5pMUbtrM8YSXWPh4gmpjiKQwxk0fEZ_crMdpbSvDJt783ozm4k_aY5qethzJU70yQ/s1600/Capture19.PNG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHFMawx__ssQdonz9faBw6UuT3Z0O0rrq1X9CBT7yqiqjPb3b2bFUkY4zyfZbINX1sJCf74e0Le1RLG7Rl6iVi2wSVd4kqnAUTdvHYztFQwbtgjAWhM-95R0Ppag5VqMEo388LNfvXpEc/s1600/Capture13.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHFMawx__ssQdonz9faBw6UuT3Z0O0rrq1X9CBT7yqiqjPb3b2bFUkY4zyfZbINX1sJCf74e0Le1RLG7Rl6iVi2wSVd4kqnAUTdvHYztFQwbtgjAWhM-95R0Ppag5VqMEo388LNfvXpEc/s1600/Capture13.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIUVZK89YGlK4nqkXqwxNi_3yPalq62E9IUqYMSVOtdkqf04k_enjwttwv0ETa8_eh9dz1f_XOyOhnY1B9bNxgBC7PEVzwjN24qe41tFei4AwB7uffdgW19NLQlRhE4Fg0-sZCOBjNlRw/s1600/Capture14.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIUVZK89YGlK4nqkXqwxNi_3yPalq62E9IUqYMSVOtdkqf04k_enjwttwv0ETa8_eh9dz1f_XOyOhnY1B9bNxgBC7PEVzwjN24qe41tFei4AwB7uffdgW19NLQlRhE4Fg0-sZCOBjNlRw/s1600/Capture14.PNG" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Statistical Semantic Search is an interesting option, but it requires a separate setup. Details are here, if you're interested:</div>
<a href="https://msdn.microsoft.com/en-us/library/gg509085%28v=sql.120%29.aspx?f=255&MSPPError=-2147217396">Install and Configure Semantic Search</a><br />
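<div>
<br /></div>
<div>
Once that setup is done, one way to confirm the instance is ready for semantic indexing is to query the catalog view below. This is a minimal sketch: a single row back should mean the semantic language statistics database is registered, while no rows suggests the setup is not finished yet.</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
-- Returns one row when the semantic language statistics database is registered
SELECT *
FROM sys.fulltext_semantic_language_statistics_database;
</pre>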
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjfEiee5EW_dZag9PJw7BruKxJEw3OpoA5VDu6j1wqoM6aj-PmkBmI6APtXcR8w9QB3L7v4-r1jQ4EtHzfEkg4xpb3EygT_tokLGYWGO0cFKB3j-w0JIXRwa65rMVnPJHooHm6-GnmROg/s1600/Capture15.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjfEiee5EW_dZag9PJw7BruKxJEw3OpoA5VDu6j1wqoM6aj-PmkBmI6APtXcR8w9QB3L7v4-r1jQ4EtHzfEkg4xpb3EygT_tokLGYWGO0cFKB3j-w0JIXRwa65rMVnPJHooHm6-GnmROg/s1600/Capture15.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhl7EZFZT9JB7C3fnN_MpW491C6wx5WJTI0gvlhZtPTdiE1w1b9uB1Gj2_NzFhNCreX-bmU57V1g-c0nVJAWqQ7z_8bvbA52-SUNestZqbpGRBFGbfKoKITTh9SkFihNnyML_X9FfXFww/s1600/Capture16.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhl7EZFZT9JB7C3fnN_MpW491C6wx5WJTI0gvlhZtPTdiE1w1b9uB1Gj2_NzFhNCreX-bmU57V1g-c0nVJAWqQ7z_8bvbA52-SUNestZqbpGRBFGbfKoKITTh9SkFihNnyML_X9FfXFww/s1600/Capture16.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhKGzA81mkViYr1MmNujBdCIdXvGwMO5eBcaza8PWM3f4GKCLuzvPBn2fbvyl_9O8VxTcfll-yVsvszhMMDoF91ZNstpblIKttYH-KI5qZXNiuxiT_zkGclRxvBctipo99vQqSpaNZ-Ts/s1600/Capture17.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhKGzA81mkViYr1MmNujBdCIdXvGwMO5eBcaza8PWM3f4GKCLuzvPBn2fbvyl_9O8VxTcfll-yVsvszhMMDoF91ZNstpblIKttYH-KI5qZXNiuxiT_zkGclRxvBctipo99vQqSpaNZ-Ts/s1600/Capture17.PNG" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQ2ozzX1IAAfj1_hBPsBiajmhFvv-o22_fkgaO-1CnmY_Idtgmi-kN_xm5T2Sa_yyrODY3sJod2U8t3brLvRvObc3JnRpN837Q0eebTPPdPLP_Zv1Ajw9I6mD6RQPjjLnrdS47ygTeLew/s1600/Capture18.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQ2ozzX1IAAfj1_hBPsBiajmhFvv-o22_fkgaO-1CnmY_Idtgmi-kN_xm5T2Sa_yyrODY3sJod2U8t3brLvRvObc3JnRpN837Q0eebTPPdPLP_Zv1Ajw9I6mD6RQPjjLnrdS47ygTeLew/s1600/Capture18.PNG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
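<div>
If you prefer a script to the wizard, a rough T-SQL equivalent might look like the following. Treat it as a sketch under assumptions rather than an exact copy of the wizard selections: it assumes English (LCID 1033) as the word-breaker language and automatic change tracking, and it leaves out the optional <span style="font-family: 'courier new'; font-size: x-small;">STATISTICAL_SEMANTICS</span> keyword, which only works after the semantic search setup mentioned above.</div>
<pre style="font-family: 'courier new'; font-size: x-small;">
USE [FULLTEXTDEMO]
GO

-- Index the [file] column, using [filetype] to choose the filter for each row.
-- KEY INDEX must name a unique, single-column, non-nullable index: here the primary key.
CREATE FULLTEXT INDEX ON [dbo].[files]
    ([file] TYPE COLUMN [filetype] LANGUAGE 1033) -- 1033 = English (assumed)
    KEY INDEX [PK_files]
    ON [FILESCATALOG]
    WITH CHANGE_TRACKING AUTO;                    -- assumed setting
GO
</pre>
<div>
<br /></div>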
Next, I loaded a couple of files:<br />
<br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filetype]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'D:\Files\Administrators Guide.pdf'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'.pdf'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">bulkcolumn</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">OPENROWSET</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: blue; font-family: "courier new"; font-size: x-small;">BULK</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'D:\Files\Administrators Guide.pdf'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">single_blob</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">AS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">input</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: blue; font-family: "courier new"; font-size: x-small;">INSERT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INTO</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filetype]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'D:\Files\Installation Guide.pdf'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'.pdf'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">bulkcolumn</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">OPENROWSET</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: blue; font-family: "courier new"; font-size: x-small;">BULK</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'D:\Files\Installation Guide.pdf'</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">single_blob</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">AS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">input</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
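<br />
Full-text population runs in the background, so a query issued immediately after the inserts can come back empty even though nothing is wrong. A small sketch for checking progress, assuming the catalog and table names used above:<br />
<pre style="font-family: 'courier new'; font-size: x-small;">
USE [FULLTEXTDEMO]
GO

-- 0 means the catalog is idle, i.e. no population is currently running
SELECT FULLTEXTCATALOGPROPERTY('FILESCATALOG', 'PopulateStatus') AS populate_status;

-- Peek at the terms that made it into the index
SELECT TOP (20) display_term, document_count
FROM sys.dm_fts_index_keywords(DB_ID('FULLTEXTDEMO'), OBJECT_ID('dbo.files'))
ORDER BY document_count DESC;
</pre>
If the second query returns only an end-of-file marker, or nothing at all, the documents were probably not filtered, which usually points back to a missing filter for the file type.<br />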
<br />
Finally, some fun querying the index (see <a href="https://msdn.microsoft.com/en-us/library/ms142583(v=sql.120).aspx">Query with Full-Text Search</a>):<br />
<br />
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--<a href="https://technet.microsoft.com/en-us/library/ms142538(v=sql.105).aspx">Simple Term</a></i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>CONTAINS</b></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'"SQL Server"'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--<a href="https://technet.microsoft.com/en-us/library/ms142492(v=sql.105).aspx">Prefix Term</a></i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>CONTAINS</b></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'"MS*"'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--<a href="https://technet.microsoft.com/en-us/library/ms142566(v=sql.105).aspx">Generation Term</a></i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>CONTAINS</b></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'FORMSOF(INFLECTIONAL, "install")'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--<a href="https://technet.microsoft.com/en-us/library/ms142568(v=sql.120).aspx">Proximity Term</a></i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>CONTAINS</b></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'NEAR((".NET","SQL Server"),2)'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--<a href="https://technet.microsoft.com/en-us/library/ms142577(v=sql.105).aspx">Weighted Term</a></i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">F</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">KEY_TBL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">rank</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">AS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">F</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">INNER</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">JOIN</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">CONTAINSTABLE</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'ISABOUT ("Install*", Windows WEIGHT(0.1), SQL WEIGHT(0.9) ) '</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">AS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">KEY_TBL</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">ON</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">F</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">id</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">=</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">KEY_TBL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[key]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">ORDER</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">BY</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">KEY_TBL</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">rank</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">DESC</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
<span style="color: green; font-family: "courier new"; font-size: x-small;"><i>--PROTIP: if you do it like this you can get a clickable link</i></span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">SELECT</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'file://'</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: silver; font-family: "courier new"; font-size: x-small;">+</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[filename]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: blue; font-family: "courier new"; font-size: x-small;">AS</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[File]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FROM</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[FULLTEXTDEMO]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[dbo]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">.</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[files]</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">WHERE</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: #ff0080; font-family: "courier new"; font-size: x-small;"><b>CONTAINS</b></span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">(</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">[file]</span><span style="color: silver; font-family: "courier new"; font-size: x-small;">,</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: red; font-family: "courier new"; font-size: x-small;">'"SQL Server"'</span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">)</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="color: blue; font-family: "courier new"; font-size: x-small;">FOR</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">xml</span><span style="font-family: "courier new"; font-size: x-small;"> </span><span style="color: maroon; font-family: "courier new"; font-size: x-small;">raw</span><span style="font-family: "courier new"; font-size: x-small;"> </span><br />
<span style="font-family: "courier new"; font-size: x-small;"><br /></span>
You can also get crazy with custom synonyms via: <a href="https://technet.microsoft.com/en-us/library/ms142491(v=sql.120).aspx">Configure and Manage Thesaurus Files for Full-Text Search</a></div>
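<div>
<br />
As a rough sketch of what a thesaurus buys you at query time (assuming the same [FULLTEXTDEMO].[dbo].[files] table used above, and assuming a thesaurus file for the index language has been edited and loaded), the THESAURUS form of FORMSOF expands the search term using your synonym entries:<br />
<pre class="lang-sql prettyprint prettyprinted" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;">--Thesaurus Term (sketch; results depend entirely on your thesaurus file)
SELECT [filename]
FROM [FULLTEXTDEMO].[dbo].[files]
WHERE CONTAINS([file], 'FORMSOF(THESAURUS, "install")')</pre>
If the thesaurus file has no entry for the term, this should behave like a plain search for the term itself.<br />
</div>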
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com1tag:blogger.com,1999:blog-6818257984018916419.post-70929055723014221702016-05-25T10:58:00.001-07:002016-06-23T07:58:39.081-07:00R Services RoundupThe <a href="https://blogs.technet.microsoft.com/dataplatforminsider/2016/05/02/get-ready-sql-server-2016-coming-on-june-1st/?platform=hootsuite">SQL Server 2016 release date is June 1, 2016</a>. Among the new features everyone is talking about is "R Services". <br />
<br />
Below is a quick reference of links related to "R Services":<br />
<br />
<a href="https://msdn.microsoft.com/en-us/library/ms144275.aspx">Editions and Components of SQL Server 2016</a><br />
<a href="https://msdn.microsoft.com/en-us/library/mt721284.aspx">Differences in R Features between Editions of SQL Server</a><br />
<a href="https://blogs.technet.microsoft.com/dataplatforminsider/2016/03/29/in-database-advanced-analytics-with-r-in-sql-server-2016/">In-database Advanced Analytics with R in SQL Server 2016</a><br />
There are additional links at the bottom of this article including this video:<br />
<br />
<iframe allowfullscreen="" frameborder="0" height="315" src="https://www.youtube.com/embed/8Sly49zDZEw?list=PL8nfc9haGeb6T3HaGhQWvBz1AqS9d6Zv_" width="560"></iframe><br />
<br />
<a href="https://msdn.microsoft.com/en-us/library/mt591993.aspx?wt.mc_id=WW_CE_DM_OO_BLOG_NONE&f=255&MSPPError=-2147217396">SQL Server R Services Tutorials</a><br />
<a href="http://www.mssqlgirl.com/installing-packages-in-sql-server-r-services.html">Installing Packages in SQL Server R Services</a><br />
<a href="http://sqlinthewild.co.za/index.php/2016/05/24/sql-server-2016-features-r-services/">SQL Server 2016 features: R services</a><br />
<br />
Updates:<br />
<a href="http://blog.revolutionanalytics.com/2016/05/predictive-maintenance-r-code.html">http://blog.revolutionanalytics.com/2016/05/predictive-maintenance-r-code.html</a><br />
<a href="http://www.desertislesql.com/wordpress1/?p=1293">Using Visual Studio to develop R for SQL Server 2016</a><br />
<a href="http://blog.revolutionanalytics.com/2016/06/visualizing-a-flood-with-r.html">Visualizing a flood with R</a><br />
<a href="http://www.r-bloggers.com/what-are-the-best-machine-learning-packages-in-r/">What are the Best Machine Learning Packages in R?</a><br />
<a href="http://www.radacad.com/interactive-r-charts-in-power-bi">Interactive R Charts in Power BI</a><br />
<a href="http://www.desertislesql.com/wordpress1/?p=1332">Creating R Code to run on SQL Server 2016</a><br />
<a href="https://blogs.msdn.microsoft.com/sqlcat/2016/06/16/early-customer-experiences-with-sql-server-r-services/">Early Customer Experiences with SQL Server R Services</a><br />
<br />
<br />
If you're interested in learning R, there are plenty of books out there too.<br />
<br />
<br />
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-5499885291400646212016-05-10T09:29:00.001-07:002016-05-10T09:29:27.473-07:00Oracle Linked Server SetupFirst things first, you're going to need to know which version of Oracle you're going to be hitting and download the corresponding client. I'm downloading the 11g 64 bit client from here today:<br />
<a href="http://www.oracle.com/technetwork/database/enterprise-edition/downloads/112010-win64soft-094461.html">http://www.oracle.com/technetwork/database/enterprise-edition/downloads/112010-win64soft-094461.html</a><br />
<br />
Thanks to @Oracle there's a whole rigmarole where you have to register for a login, blah, blah, blah, boo <img border="0" height="16" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikLi0Is74EJs5JrcgbF60J1XUH_khEq39Cv8djs-Ke4v6hAGr3Y7mMr4i0yhz1jjNCrXNgfBcPbwQWbnQCCqjWh-zCKxi5FAlCeTItUpQRwDr3oJsVHh_9cQw73KnyDgHk804nGhhgVtQ/s1600/images.png" width="16" />
<br />
<br />
At this point it's probably a good idea to make sure your Oracle DBAs have assigned you a User/Schema and a password. You may also want to set up your tnsnames.ora file (or just make a copy of a working one); more on the tnsnames.ora file later.<br />
<br />
OK, so now you've sawed off your right arm and provided it to Oracle for the privilege of downloading their client. Unzip the win64_11gR2_client.zip file and drill down to ..\Oracle11g_Client\win64_11gR2_client\client\ and launch setup.exe. I'm running SQL Server 2012 on Windows Server 2012 R2 and I get this annoying "Oracle Client Installer" error message because it doesn't seem to recognize current software. <br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEjkByRE9fsX6YvF6FqlUdaFPZvyo3DHnjxEfvPfJeZ-wqGhTSuvRhyxrsLv_AiTBhj-am-rKTv36sRujwTA0RkDszFwfxxEat_m52JktNDCCcgXHweh9C9MBF0fVnv9fyEV4ki2RBPNA/s1600/1.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEjkByRE9fsX6YvF6FqlUdaFPZvyo3DHnjxEfvPfJeZ-wqGhTSuvRhyxrsLv_AiTBhj-am-rKTv36sRujwTA0RkDszFwfxxEat_m52JktNDCCcgXHweh9C9MBF0fVnv9fyEV4ki2RBPNA/s1600/1.PNG" /></a></div>
<br />
You can safely ignore this error message by clicking "Yes".<br />
<br />
Next you'll be prompted with "What type of installation do you want?"<br />
More installation type details are <a href="https://docs.oracle.com/cd/B28359_01/install.111/b32003/install_overview.htm" target="_blank">here: https://docs.oracle.com/cd/B28359_01/install.111/b32003/install_overview.htm</a><br />
<br />
I chose Administrator because it, "<span style="background-color: white; color: #222222; font-family: "helvetica neue" , "neue helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 19.6px;">also provides tools that enable you to administer Oracle Database."</span><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj39FA07j20vcuJ1lE9Qx4xuI3HlMcBnMRLylABaAjKz2MNDNdZ8OBMbmV8czkoTJFU26tCyD7L1ZO5RDVKbpHPFwlCZfiHuG0jo1C-P0fNKvvvFH4DQkrMvtZ5jhFsiSqOBGYdSZAZfs0/s1600/2.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="481" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj39FA07j20vcuJ1lE9Qx4xuI3HlMcBnMRLylABaAjKz2MNDNdZ8OBMbmV8czkoTJFU26tCyD7L1ZO5RDVKbpHPFwlCZfiHuG0jo1C-P0fNKvvvFH4DQkrMvtZ5jhFsiSqOBGYdSZAZfs0/s640/2.PNG" width="640" /></a></div>
Pick your language, click next<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCwKNBdO2yUcu8hyihcPGgdBNpctILpVqjEI35sXBVRXiStGso_5euxBjttfmcXRq4r8pIzHyi2r_1XBO7rMU08fIG3kvvbJUIaedexiJ07O4jM1-CK4sHjOKo3J3Ywz2dFP44cbfYjO8/s1600/3.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="476" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCwKNBdO2yUcu8hyihcPGgdBNpctILpVqjEI35sXBVRXiStGso_5euxBjttfmcXRq4r8pIzHyi2r_1XBO7rMU08fIG3kvvbJUIaedexiJ07O4jM1-CK4sHjOKo3J3Ywz2dFP44cbfYjO8/s640/3.PNG" width="640" /></a></div>
On the Specify Installation Location page, I like to change the Oracle Base directory so that it's not tied to my login. By default, the Base directory will have the login at the end:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdz2xR9SVcdrJIsGiL5Yl2Uaf5NuXL6zS4_Yg0Q0FCOP35n4yNCxle1CDmfl3Ost2wBV1kIJ3NTv2-f6Qfr4S7FAIvPWsIDghasq8jpbYssKG6n7LFJYx7UjjvmkU0zRNKLhbEnsL-_Qw/s1600/4.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="476" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdz2xR9SVcdrJIsGiL5Yl2Uaf5NuXL6zS4_Yg0Q0FCOP35n4yNCxle1CDmfl3Ost2wBV1kIJ3NTv2-f6Qfr4S7FAIvPWsIDghasq8jpbYssKG6n7LFJYx7UjjvmkU0zRNKLhbEnsL-_Qw/s640/4.PNG" width="640" /></a></div>
...I strip off the login from the end like so:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDRmMuxfZaH4Xwtc5Knsg7Hc_BPJwFz0Dg5qCWJAX4tTc5us3ZZH0hAMw_O1bJ533lK5S9e85jZ66EwT4ZuH41kSfGOKWp7eSy0ctr1lL7zVq5iatMrhburS3e6Z0mKbRxm75lMbqrqiM/s1600/5.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDRmMuxfZaH4Xwtc5Knsg7Hc_BPJwFz0Dg5qCWJAX4tTc5us3ZZH0hAMw_O1bJ533lK5S9e85jZ66EwT4ZuH41kSfGOKWp7eSy0ctr1lL7zVq5iatMrhburS3e6Z0mKbRxm75lMbqrqiM/s640/5.PNG" width="640" /></a></div>
It is helpful to have a consistent location across multiple servers.<br />
Click Next and Finish to have the installer do its thing.<br />
<br />
Once the installer is done, the next thing to do is set up that tnsnames.ora file. If your Oracle DBA handed you a file, simply copy it to C:\app\product\11.2.0\client_1\network\admin.<br />
<br />
If you need to make one from scratch, there are some basics here:<br />
<a href="http://www.orafaq.com/wiki/Tnsnames.ora">http://www.orafaq.com/wiki/Tnsnames.ora</a><br />
<br />
There are plenty of other options besides those basics. You can dig in here to Load Balancing and the like if you're so inclined:<br />
<a href="https://docs.oracle.com/cd/E11882_01/network.112/e10835/tnsnames.htm#NETRF262">https://docs.oracle.com/cd/E11882_01/network.112/e10835/tnsnames.htm#NETRF262</a><br />
<br />
Now that we have the client installed and our tnsnames file configured, we need to set up our ODBC data source. Start this process by launching ODBC Data Sources:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtLuvYRL5YVF_KgzMzTXNtcAbvrH1jKh-PYt3SpqIzd_JIIKyZ9DtUuSXt0kUDyDzzDCTsOKnDBcr_u9mhNEEdzkoIZWFMMETynbxMqS2uBqHlQj86RxdqxijC7jtq_wCZiCwooE4ahqw/s1600/6.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtLuvYRL5YVF_KgzMzTXNtcAbvrH1jKh-PYt3SpqIzd_JIIKyZ9DtUuSXt0kUDyDzzDCTsOKnDBcr_u9mhNEEdzkoIZWFMMETynbxMqS2uBqHlQj86RxdqxijC7jtq_wCZiCwooE4ahqw/s1600/6.PNG" /></a></div>
Click System DSN and the Add... button:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4UrI1ScNfiZaHzfvU0KIIKya7K_sJTfWRvBwRGFM9BLsi7pcJ3tnodNtzatdTuim3J1PuCGRVknCA0x_D3MeBZthvFgo5m7dT8TZVmbS9kPV-UyBJBpa7TI1voDY2vnu1kvnsJRZ7RDc/s1600/7.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="450" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4UrI1ScNfiZaHzfvU0KIIKya7K_sJTfWRvBwRGFM9BLsi7pcJ3tnodNtzatdTuim3J1PuCGRVknCA0x_D3MeBZthvFgo5m7dT8TZVmbS9kPV-UyBJBpa7TI1voDY2vnu1kvnsJRZ7RDc/s640/7.PNG" width="640" /></a></div>
Select "Oracle in OraClient11g_home1":<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgECbQgPpaNSzmNS0G3pI0ZZlgkbMZKTJPG3VKG40YURmiAzPdZ0_fOmmFCgkhhaQVjc27o2xKSB3e3j7_sQy5_MWQprToNKQ8xOnuJq0rdm42gFyIoGGA2RHiviVKZltxLEMvhfM2XxU/s1600/8.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="482" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgECbQgPpaNSzmNS0G3pI0ZZlgkbMZKTJPG3VKG40YURmiAzPdZ0_fOmmFCgkhhaQVjc27o2xKSB3e3j7_sQy5_MWQprToNKQ8xOnuJq0rdm42gFyIoGGA2RHiviVKZltxLEMvhfM2XxU/s640/8.PNG" width="640" /></a></div>
You'll next be prompted by the "Oracle ODBC Driver Configuration" Dialog Box:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_lGtrj5YmHcaRdJkjbiP_Rm-W0GR0dDz9wwmduXKlupwoqGyJccWmSDf3gnawhbvX756nnEFXUXzf4qJC4lhN_ldB2dGIN3Oev_ziYoVsbFzhsseREC06vVjhuaJ_UVsod6qbhZ-6KtY/s1600/9.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="416" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_lGtrj5YmHcaRdJkjbiP_Rm-W0GR0dDz9wwmduXKlupwoqGyJccWmSDf3gnawhbvX756nnEFXUXzf4qJC4lhN_ldB2dGIN3Oev_ziYoVsbFzhsseREC06vVjhuaJ_UVsod6qbhZ-6KtY/s640/9.PNG" width="640" /></a></div>
In this dialog box, the "TNS Service Name" drop down box should display your entries from the tnsnames.ora file. Next, enter your Oracle User ID and click "Test Connection", at which point you'll be prompted for your password. Everything should test successfully at this point.<br />
<br />
Now would be a good time to restart. Unfortunately, yes you need to restart...<img border="0" height="16" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikLi0Is74EJs5JrcgbF60J1XUH_khEq39Cv8djs-Ke4v6hAGr3Y7mMr4i0yhz1jjNCrXNgfBcPbwQWbnQCCqjWh-zCKxi5FAlCeTItUpQRwDr3oJsVHh_9cQw73KnyDgHk804nGhhgVtQ/s1600/images.png" width="16" /><br />
<br />
You can do an additional test via sqlplus. Open a Windows command prompt and enter the following:<br />
<br />
<pre class="lang-sql prettyprint prettyprinted" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;"><span style="color: #393318; font-family: "consolas" , "menlo" , "monaco" , "lucida console" , "liberation mono" , "dejavu sans mono" , "bitstream vera sans mono" , "courier new" , monospace , sans-serif;">sqlplus user/pass@[</span><span style="background-color: transparent;"><span style="color: #393318; font-family: "consolas" , "menlo" , "monaco" , "lucida console" , "liberation mono" , "dejavu sans mono" , "bitstream vera sans mono" , "courier new" , monospace , sans-serif;">addressname</span></span><span style="color: #393318; font-family: "consolas" , "menlo" , "monaco" , "lucida console" , "liberation mono" , "dejavu sans mono" , "bitstream vera sans mono" , "courier new" , monospace , sans-serif;">]</span></pre>
(Where addressname is one of your connections from tnsnames.ora)<br />
<br />
Next we need to set up our linked server. Pop open SQL Server Management Studio (SSMS) and drill down to "Server Objects" > "Linked Servers", right-click and choose "New Linked Server":<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiziK7vwWNkXgl_NSAPozPIHnDiPk2-kNZ3aqtfebHvd9hAJhaJSASoPRTv-m6j_iQC6NuhElpce9Kq8ot79MCgaK9uN_DPTD3cZlBAqd5cj3Y7pPO8Gv12wb9elUoqsdBL5TVK2Secbw/s1600/12.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiziK7vwWNkXgl_NSAPozPIHnDiPk2-kNZ3aqtfebHvd9hAJhaJSASoPRTv-m6j_iQC6NuhElpce9Kq8ot79MCgaK9uN_DPTD3cZlBAqd5cj3Y7pPO8Gv12wb9elUoqsdBL5TVK2Secbw/s1600/12.PNG" /></a></div>
<br />
Change the Provider to "Oracle Provider for OLE DB". Enter a product name of Oracle. For the "Data Source" enter your [addressname] from the tnsnames.ora file.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZNsxkQYhqjvQlU_fhyphenhyphen-L8wDYsL_KIOuRHbaPoSqHXxUrdmGYLQ0gn45cGUBZV_Ga_Z7V3AmK7hrcgDjZBFTVcoSnTTVaNdv84Z3-ODDiZbk4cnOUe30hul75ze1wxZEU6vNVURwxNZzo/s1600/13.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="572" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZNsxkQYhqjvQlU_fhyphenhyphen-L8wDYsL_KIOuRHbaPoSqHXxUrdmGYLQ0gn45cGUBZV_Ga_Z7V3AmK7hrcgDjZBFTVcoSnTTVaNdv84Z3-ODDiZbk4cnOUe30hul75ze1wxZEU6vNVURwxNZzo/s640/13.PNG" width="640" /></a></div>
Next is setting up the security. If you have a User ID and Password from your Oracle DBA, click Security and change the bottom radio button to "Be made using this security context", entering your credentials:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6R3js2-7WOVM_61UwNZLOHfPDaxTHNB7FsHCLnC4-GSuIz6rzlnc19aGFgWYCL8BuWEFPlux_Xdzpf8UlBUuWrFYQPtruuHZs1pZ4UzGvQdTOHxdFew9zyA8EGFXUeQYrYuzCZYa7Edo/s1600/14.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="578" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6R3js2-7WOVM_61UwNZLOHfPDaxTHNB7FsHCLnC4-GSuIz6rzlnc19aGFgWYCL8BuWEFPlux_Xdzpf8UlBUuWrFYQPtruuHZs1pZ4UzGvQdTOHxdFew9zyA8EGFXUeQYrYuzCZYa7Edo/s640/14.PNG" width="640" /></a></div>
The last page of the "New Linked Server" Dialog Box is Server Options. I typically leave the defaults, with one exception: I do like to change the "Lazy Schema Validation" from False to True:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIP2pWp7siOWA7_HvMvLemzbZp078-C57e9WRJeZE3KHNw3HOwfEzYe2UnSlsy7TE4rgeZ424qSDmOx0ertUbPwGTw3tK-PRwCIaQhCM-vDDGKlUOjKDz0UgJwUMAqGXGFBRfWdRoQrSU/s1600/15.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="575" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIP2pWp7siOWA7_HvMvLemzbZp078-C57e9WRJeZE3KHNw3HOwfEzYe2UnSlsy7TE4rgeZ424qSDmOx0ertUbPwGTw3tK-PRwCIaQhCM-vDDGKlUOjKDz0UgJwUMAqGXGFBRfWdRoQrSU/s640/15.PNG" width="640" /></a></div>
"The lazy schema validation option is set to true for performance reasons. It allows the query processor to skip schema checking of remote tables if the query can be satisfied on a single member server."<br />
<div>
<a href="http://www.amazon.com/gp/product/0672336928/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0672336928&linkCode=as2&tag=jonmorisissql-20&linkId=7TPYFDFX7PLCFCZW">REF: Microsoft SQL Server 2012 Unleashed</a>
</div>
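<br />
If you'd rather script the linked server than click through the dialog, the same settings can be sketched in T-SQL with the system stored procedures sp_addlinkedserver, sp_addlinkedsrvlogin and sp_serveroption. The linked server name ORA_LINK, the addressname data source, and the user/pass credentials below are placeholders I've made up for illustration. OraOLEDB.Oracle is the name the "Oracle Provider for OLE DB" normally registers under, but check the Providers folder under "Linked Servers" on your own instance before relying on it:<br />
<pre class="lang-sql prettyprint prettyprinted" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;">-- Sketch only: ORA_LINK, addressname, user and pass are placeholders
EXEC master.dbo.sp_addlinkedserver
    @server     = N'ORA_LINK',         -- name you will reference in queries
    @srvproduct = N'Oracle',           -- product name
    @provider   = N'OraOLEDB.Oracle',  -- Oracle Provider for OLE DB
    @datasrc    = N'addressname';      -- entry from tnsnames.ora

-- Security page: "Be made using this security context"
EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname  = N'ORA_LINK',
    @useself     = N'False',
    @locallogin  = NULL,               -- NULL applies to all local logins
    @rmtuser     = N'user',
    @rmtpassword = N'pass';

-- Server Options page: Lazy Schema Validation = True
EXEC master.dbo.sp_serveroption
    @server   = N'ORA_LINK',
    @optname  = N'lazy schema validation',
    @optvalue = N'true';</pre>
Scripting it this way also makes it easier to keep the configuration consistent across multiple servers.<br />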
<br />
One final test would be to attempt an <a href="https://msdn.microsoft.com/en-us/library/ms188427.aspx">OPENQUERY</a>:<br />
<br />
<pre class="lang-sql prettyprint prettyprinted" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;">SELECT *
FROM OPENQUERY(NAME, 'SELECT * FROM [WHEREVER]') AS ALIAS;</pre>
(Where NAME is the name of your linked server and [WHEREVER] is an Oracle table or view you can read.)<br />
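<br />
As a concrete smoke test, here is a sketch that assumes a linked server named ORA_LINK (a placeholder; substitute your own name). sp_testlinkedserver checks that SQL Server can connect at all, and the pass-through query against Oracle's built-in DUAL table avoids depending on any particular table in your schema:<br />
<pre class="lang-sql prettyprint prettyprinted" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;">-- Sketch only: ORA_LINK is a placeholder linked server name
EXEC master.dbo.sp_testlinkedserver N'ORA_LINK';

-- The inner query runs on the Oracle side and returns one row
SELECT *
FROM OPENQUERY(ORA_LINK, 'SELECT SYSDATE AS ORA_NOW FROM DUAL') AS T;</pre>
If the first statement fails, the problem is most likely in the client, tnsnames.ora or credentials rather than in your query.<br />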
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com1tag:blogger.com,1999:blog-6818257984018916419.post-210923920814526752016-05-04T12:52:00.000-07:002016-05-11T08:09:30.990-07:00Snarfing Data from a website and importing via the Import and Export WizardStep 1, Snarf the data. In my example I'm going to a public-facing website and simply copying and pasting the HTML table into Excel:<br />
<br />
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaMw7NCEAbWIU3F79NoOboCmdp1z5SjHZTtTvbWRAH8P7GRHZyXqmBeuVDEzWPnAJsGpPTKzgfc8_dyCKAJxbU0ZXRRpbp7GxWVHpu3Dv0cpDV7qOKlr8rg4dyO90qu76kIlXHoja1lrk/s1600/1.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="146" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaMw7NCEAbWIU3F79NoOboCmdp1z5SjHZTtTvbWRAH8P7GRHZyXqmBeuVDEzWPnAJsGpPTKzgfc8_dyCKAJxbU0ZXRRpbp7GxWVHpu3Dv0cpDV7qOKlr8rg4dyO90qu76kIlXHoja1lrk/s320/1.PNG" width="320" /></a></div>
<div style="text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7FkWwWxLuz7a18JQ91Iw7LS1kaV3J2bNSVz_3B7KRUa7AA6ldlkUD3usRKVES6Ma5iPBbFz1Mq7pJKsKovkTGS28j4mIYQ8U41MssCL3190QritC_JNwysO5L5TlRd_F-lZKelpoQJH4/s1600/2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="97" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7FkWwWxLuz7a18JQ91Iw7LS1kaV3J2bNSVz_3B7KRUa7AA6ldlkUD3usRKVES6Ma5iPBbFz1Mq7pJKsKovkTGS28j4mIYQ8U41MssCL3190QritC_JNwysO5L5TlRd_F-lZKelpoQJH4/s320/2.png" width="320" /></a></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Next, save this as a CSV file.<br />
<br />
Now jump into SQL Server Management Studio, drill down to your database (you may want to create a new, empty database for your snarfing), right-click it, and start the Import and Export Wizard via "Import Data":<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvGa_gBsCn7z2aipH19OP0wQ8xnAKMbnL2losjPhMfx2OSi1C3UzDxh7dMZo9AjAgJlTF_iCg_1xJfRvwQ01d8QCF3Qf61Gbw0sKqTyM1jrVKpuiGoZ_C9ARq25C4fUwCOC-bjJufhjjE/s1600/3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvGa_gBsCn7z2aipH19OP0wQ8xnAKMbnL2losjPhMfx2OSi1C3UzDxh7dMZo9AjAgJlTF_iCg_1xJfRvwQ01d8QCF3Qf61Gbw0sKqTyM1jrVKpuiGoZ_C9ARq25C4fUwCOC-bjJufhjjE/s320/3.png" width="317" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
This is pretty much a next, next, next, finish scenario (with a few selections that need to be made). Just follow along with the screenshots:</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Next</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0YBNlFBxJ0FRfWKndkPKOoLdxyM0rAPA_-42pzx-59FPum5mEz1mdAwFtRH-JaSDafXurgOPbFzF8SDbwHUAM5u5pinXZQWQhJHhVDm5tg0oIwm4axzfWhn0CKmyjtPovX1ZtPOoatZM/s1600/4.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0YBNlFBxJ0FRfWKndkPKOoLdxyM0rAPA_-42pzx-59FPum5mEz1mdAwFtRH-JaSDafXurgOPbFzF8SDbwHUAM5u5pinXZQWQhJHhVDm5tg0oIwm4axzfWhn0CKmyjtPovX1ZtPOoatZM/s320/4.PNG" width="313" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Flat File Source (choose your .csv)</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifrM0levKyv6nagdsyZfDPRE6RYSG5DbETf6sASkYoZQ0v5IZ_KbXWtWnsBVT-OfG4CfA6F6Im6Yd8g9BK_pifCKOCoCmtactAtXSxVAbUrfaPubL7ufEScRD1PI_2zBFWzOD2g72hztc/s1600/5.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifrM0levKyv6nagdsyZfDPRE6RYSG5DbETf6sASkYoZQ0v5IZ_KbXWtWnsBVT-OfG4CfA6F6Im6Yd8g9BK_pifCKOCoCmtactAtXSxVAbUrfaPubL7ufEScRD1PI_2zBFWzOD2g72hztc/s320/5.PNG" width="311" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Next</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsMdo2u3gTPKFHZPcCQiEJUTTOju2v_Tj-KThsJxSiEfZuoz1jwJRE-3SoVDqCVBxG32zEQq44jksBqNhUOBsKyUPPm7qzJM_Or9qrhUMLCCsAk4Jdu-gHMvo13LuMASCv8W6ZlbIhPr8/s1600/6.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsMdo2u3gTPKFHZPcCQiEJUTTOju2v_Tj-KThsJxSiEfZuoz1jwJRE-3SoVDqCVBxG32zEQq44jksBqNhUOBsKyUPPm7qzJM_Or9qrhUMLCCsAk4Jdu-gHMvo13LuMASCv8W6ZlbIhPr8/s320/6.PNG" width="314" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Under Choose a Destination, change the destination to "SQL Server Native Client":</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEp5lxdxsv4U7H5YI3INLOZJpSZq_kdds67ELqRaqBgkUYtSTJG9xwHFMOGO3nkHOpZD5G-mBX1vTX-qdMZt5NQE71h8dVGdxqVUAXJwpyLhnl_dNL1sXqrsrpxNOwvihdXwtuxEUOJ38/s1600/7.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEp5lxdxsv4U7H5YI3INLOZJpSZq_kdds67ELqRaqBgkUYtSTJG9xwHFMOGO3nkHOpZD5G-mBX1vTX-qdMZt5NQE71h8dVGdxqVUAXJwpyLhnl_dNL1sXqrsrpxNOwvihdXwtuxEUOJ38/s320/7.PNG" width="309" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Next, Next, Finish (you can preview your data if you like). By default your data will end up in a table with the same name as your csv file. You can also save the SSIS package for reuse if you like.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1lWk2in1DcBfDzuKVyEDaRpZ6OrRV8tYFtJjIJU11jCG-noEK44tbIMmyJf1phi6Iz_ZWS2Wyda9CWP5B_G3iYFe-oAkFJRD2E4Nr3_6PwCb9ZHb0dNcYgMu_edA8SparO5_YxIFg9a0/s1600/8.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1lWk2in1DcBfDzuKVyEDaRpZ6OrRV8tYFtJjIJU11jCG-noEK44tbIMmyJf1phi6Iz_ZWS2Wyda9CWP5B_G3iYFe-oAkFJRD2E4Nr3_6PwCb9ZHb0dNcYgMu_edA8SparO5_YxIFg9a0/s320/8.PNG" width="312" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
In the end you'll have all your data copied from an HTML table to a CSV file and finally imported into SQL Server for your use:</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7WCyNkgGEK3t5U-gGiyHT7QnCigGG0ciwF-MSUdu8fuRV1ppp-98c2TTzGVxA_QY2jhTEPwDqcsLdNMZoskBbwtnIVYyX4MrhNM_I6bK5zGiDq99vnZs5bCjdqAsnAE76a7-S_goTbwI/s1600/9.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="76" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7WCyNkgGEK3t5U-gGiyHT7QnCigGG0ciwF-MSUdu8fuRV1ppp-98c2TTzGVxA_QY2jhTEPwDqcsLdNMZoskBbwtnIVYyX4MrhNM_I6bK5zGiDq99vnZs5bCjdqAsnAE76a7-S_goTbwI/s320/9.PNG" width="320" /></a></div>
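<div class="separator" style="clear: both; text-align: left;">
If you would rather script the load than click through the wizard, a rough T-SQL equivalent is sketched below. This is only an illustration, not what the wizard generates. The table <code>dbo.SnarfedData</code>, its columns, and the path <code>C:\Temp\snarf.csv</code> are made-up names; the path is resolved on the SQL Server machine, and a plain <code>BULK INSERT</code> like this assumes the CSV has a header row and no quoted fields containing commas.</div>
<pre class="lang-sql prettyprint" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;">-- Hypothetical staging table; adjust the columns to match your CSV.
CREATE TABLE dbo.SnarfedData (
    Col1 varchar(255) NULL,
    Col2 varchar(255) NULL,
    Col3 varchar(255) NULL
);

-- Load the file, skipping the header row.
BULK INSERT dbo.SnarfedData
FROM 'C:\Temp\snarf.csv'
WITH (
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);

-- Quick check that the rows arrived.
SELECT TOP (10) * FROM dbo.SnarfedData;
</pre>
<div class="separator" style="clear: both; text-align: left;">
Loading everything as <code>varchar</code> first and converting types afterwards tends to be more forgiving with copied-and-pasted web data.</div>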
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
P.S.</div>
<div class="separator" style="clear: both; text-align: left;">
<a href="http://curatedsql.com/author/feaselkl/">Kevin Feasel</a> piggybacked on this article on Curated SQL and recommended this book:</div>
<br /></div>
<iframe frameborder="0" marginheight="0" marginwidth="0" scrolling="no" src="//ws-na.amazon-adsystem.com/widgets/q?ServiceVersion=20070822&OneJS=1&Operation=GetAdHtml&MarketPlace=US&source=ac&ref=qf_sp_asin_til&ad_type=product_link&tracking_id=jonmorisissql-20&marketplace=amazon&region=US&placement=1593273975&asins=1593273975&linkId=GPDPOJ7KDSBEASP3&show_border=true&link_opens_in_new_window=true" style="height: 240px; width: 120px;">
</iframe>Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0tag:blogger.com,1999:blog-6818257984018916419.post-2794957918498886272016-04-12T09:50:00.001-07:002016-04-12T09:50:01.163-07:00Favorite SQL Server Feature - Replication<br />
<a href="http://t-sql.dk/?p=1492" target="_blank"><img alt="TSQL Tuesday" border="0" src="http://img.bobpusateri.com/bc/2010/06/TSQL2sDay150x150.jpg" /></a><br />
<a href="http://t-sql.dk/?p=1492" target="_blank">T-SQL Tuesday #77: Favorite SQL Server Feature</a><br />
<div>
<br /></div>
<br />
Let the weeping and gnashing of teeth begin, for I have chosen...REPliCAtion! as my favorite SQL Server feature.<br />
<br />
Seriously though, <a href="https://technet.microsoft.com/en-us/library/aa237134(v=sql.80).aspx" target="_blank">replication has been around since the beginning</a> and <a href="https://msdn.microsoft.com/en-us/library/ms151198.aspx?f=255&MSPPError=-2147217396" target="_blank">it's not going anywhere</a>. I can't think of any other feature more prolific than replication. Name another SQL Server HA/DR technology that is as extensible as replication. Replication has gotten a bad rap over the years, mostly from anecdotal comments that it "breaks all the time" or "takes too much time to manage". I've worked in many environments and have set up dozens and dozens of instances of log shipping, mirroring, clusters, availability groups, and replication. From my own anecdotal experience, I can tell you I've had more trouble with availability groups than I have with replication. If you have a good DBA who understands replication, uses it correctly, and is provided the correct tools (read: $ for hardware/infrastructure), replication works just fine. I have set up replication in a global environment in which multiple databases, publications, subscriptions, and agents ran around the clock without issue.<br />
<br />
For starters, you have three kinds of replication:<br />
<ol>
<li>Snapshot - scheduled point-in-time refresh</li>
<li>Transactional - low-latency active sync</li>
<li>Merge - consolidation of multiple sources (more or less writable, syncing subscribers)</li>
</ol>
<div>
Then you can pile on a whole list of features not available in the other SQL technologies:</div>
<div>
<ul>
<li>Replicate data to any ODBC or OLE DB accessible database including Oracle, AKA heterogeneous replication.</li>
<li>Subscribers are readable</li>
<li>Updateable subscribers</li>
<li>Peer-to-peer</li>
<li>Ability to create different indexes on the subscribers</li>
<li>Ability to selectively choose which pieces of data (articles) to replicate (i.e. not all-or-nothing)</li>
<li>Article filtering (vertical and horizontal)</li>
<li>Stored procedure executions as articles</li>
<li>Flexible enough to work over slow connections</li>
</ul>
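<div>
To make the two filtering flavors concrete, here is a conceptual sketch in plain T-SQL. These are ordinary queries, not replication commands, and the table <code>dbo.Orders</code> and its columns are hypothetical. They only illustrate which slice of a published table a subscriber would end up with.</div>
<pre class="lang-sql prettyprint" style="background-color: #eff0f1; border: 0px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;">-- Vertical filtering: the article carries only some of the columns.
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders;

-- Horizontal filtering: the article carries only the rows matching a predicate.
SELECT *
FROM dbo.Orders
WHERE Region = 'EMEA';
</pre>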
<div>
REF:</div>
<div>
Chapter 43 of <a href="http://www.amazon.com/Microsoft-SQL-Server-2014-Unleashed/dp/0672337290/ref=sr_1_1?s=books&rps=1&ie=UTF8&qid=1460478565&sr=1-1&keywords=Microsoft+SQL+Server+2014+Unleashed&refinements=p_85%3A2470955011" target="_blank">Microsoft SQL Server 2014 Unleashed</a> is excellent!</div>
</div>
Jon Morisihttp://www.blogger.com/profile/11048505964243402700noreply@blogger.com0