Posts

Showing posts from 2020

Sphinx Search engine and related technologies

Image
  Last month, I ended publishing my article series about the Sphinx search engine and related technologies on SQL Shack:   Getting started with Sphinx search engine In this article, we talked about the Sphinx search engine and how to install it on the Windows operating system. Building full-text indexes using the Sphinx search engine In this article, we talked about building full-text indexes using Sphinx . We covered the following topics: The Sphinx configuration file Defining data sources Defining full-text indexes Indexer and Searchd services options Database preparation Indexing Using Linked servers to access indexes Manticore search: a continuation of the Sphinx search engine In this article, we talked about Manticore Search, which is an open-source search engine first released in 2017 as a fork of the Sphinx search engine  

Hadoop components for SSIS

Image
This month I published a series on SQL Shack about the Hadoop components added to SSIS in SQL Server 2016 release. The series is composed of three articles as following: SSIS Hadoop Connection Manager and related tasks In this article, I gave a brief introduction to Hadoop and how it is integrated with SQL Server. Then, I illustrated how to connect to the Hadoop cluster on-premises using the SSIS Hadoop connection manager and the related tasks. Importing and Exporting data using SSIS Hadoop components In this article, I briefly explained the Avro and ORC Big Data file formats. Then, I talked about Hadoop data flow task components and how to use them to import and export data into the Hadoop cluster. Then I compared those Hadoop components with the Hadoop File System Task. Finally, I concluded my work. Connecting to Apache Hive and Apache Pig using SSIS Hadoop components In this article, I talked about Hadoop Hive and Hadoop Pig Tasks. I first gave a brief

Don’t install Hadoop on Windows!

Image
A few years ago, I was hearing from my colleagues “don’t ever think about installing Hadoop on Windows operating system!” . I was not convinced of this saying because I am a big fan of Microsoft products, especially Windows. In the past three years, I worked on three projects where I was asked to build a Hadoop cluster on Ubuntu. The first time, it was a single-node Hadoop installation with a single Apache Spark worker. The other projects were about building a Big Data ecosystem for radiation data engineering , where the multi-node Hadoop cluster is deployed. Besides Hadoop, we installed and configured Apache Kafka, Spark, Hive, Pig, and Flume installations (I have published some installation guides previously, you can check the links at the end of this article). It was hard to become familiar with those technologies for the first time since they don’t have much documentation online. Each time I had to install Hadoop, I was thinking that why always Hadoop is installed on Linu

Learn Biml for SSIS

Image
A few months ago, I started writing a new series of articles about Business Intelligence Markup Language (BIML) for the SQL Shack website. After finishing writing about using BIML in SQL Server Integration Services, I would like to share with you these articles: Getting Started With Biml In this article, we will give a brief overview of BIML by showing how it can be installed and how we can use it to generate SSIS packages using the BimlExpress tool. Finally, we will provide some useful resources to learn this technology. Converting SSIS packages to Biml scripts In this article, we will explain how to generate scripts from existing SSIS packages by showing all related options. Then, we will analyze the generated script and identify how each object is mentioned in it. Using Biml scripts to generate SSIS packages In this article, we will explain how to create SSIS packages using business intelligence markup language by providing a step-by-step guide where we will il

SSIS features face-to-face

Image
SQL Server Integration Services provides a wide variety of features that helps developers to build a robust Extract, Transform and Load process. After many years contributing to SSIS related tags on Stackoverflow.com, I can say that many developers have some misunderstanding about SSIS features (SSIS OLE DB Source, SSIS Expressions, SQL Server destination …) especially those which are very similar and have some common usability. Last year, I started writing SQL related articles for the SQL Shack blog. In this post, I would like to share with you a series I have published a few months ago called "SSIS features face-to-face" which is composed of 11 articles where I made a comparison between similar SSIS features. SSIS OLE DB Source: SQL Command vs Table or View In this article, we do a comparison between SSIS OLE DB Source data access modes:  “Table or View”  with Specific columns selected Vs. Using  “SQL Command”  to select specific columns. SSIS Expression