Imagine you are working on a job with a Merge node, and on the first build you set up a very selective and specific list of joins between two sources. All good! Now suppose that later on, for whatever reason, you have to edit your sources: the schema changes in your CSV file, you want to replace one year's data with another's, and so on. Construct, in its genius, will re-scan the source to account for any column changes, and as a result your joins will be reset. This is exactly what you want in most situations, but what if you want to keep your joins the way they were before?
Be not broken-hearted; here is a little trick: in between the Merge node and each source, insert a Filter node. There is no need to do anything with the Filter settings; just putting it there is enough. Here is an example:
Because the Filter node does not change when the sources do, the joins in your Merge node are maintained. Think of it as a bushing for your incoming data, or a simple interconnect for a swappable component. Go ahead and switch your input nodes or modify your column list. So long as the columns used by your joins and included in the output of the Merge are still there, your joins will be kept as well. Brilliant, right?
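If it helps to picture why the trick works, here is a toy model in Python. It is only a sketch and not how Construct is actually implemented: the class names `Source`, `Filter`, and `Merge`, the `subscribe` and `rescan` methods, and the rule "a Merge resets its joins when a directly connected source re-scans" are all assumptions made up to mirror the behavior described above.

```python
class Source:
    """Hypothetical input node; re-scanning notifies directly attached nodes."""
    def __init__(self, columns):
        self.columns = list(columns)
        self.listeners = []

    def subscribe(self, node):
        self.listeners.append(node)

    def rescan(self, new_columns):
        self.columns = list(new_columns)
        for node in self.listeners:
            node.on_upstream_rescan()


class Filter:
    """Hypothetical pass-through node with no filter settings."""
    def __init__(self, upstream):
        self.upstream = upstream

    @property
    def columns(self):
        # Columns simply flow through from whatever is upstream.
        return self.upstream.columns

    def subscribe(self, node):
        # The Filter itself never re-scans, so there is nothing to notify.
        pass


class Merge:
    """Hypothetical merge node holding a list of (left_col, right_col) joins."""
    def __init__(self, left, right, joins):
        self.left, self.right = left, right
        self.joins = list(joins)
        left.subscribe(self)
        right.subscribe(self)

    def on_upstream_rescan(self):
        # Assumed behavior: a re-scan of a direct input resets the joins.
        self.joins = []

    def usable_joins(self):
        # A join only works while both of its columns still exist.
        return [(l, r) for l, r in self.joins
                if l in self.left.columns and r in self.right.columns]


# Merge wired straight to the sources: the re-scan wipes the joins.
left, right = Source(["id", "year"]), Source(["id", "sales"])
direct = Merge(left, right, [("id", "id")])
left.rescan(["id", "year", "region"])
print(direct.joins)    # []

# Merge wired through Filters: the joins survive the same change.
left, right = Source(["id", "year"]), Source(["id", "sales"])
buffered = Merge(Filter(left), Filter(right), [("id", "id")])
left.rescan(["id", "year", "region"])
print(buffered.joins)  # [('id', 'id')]
```

In this model the Filter plays the part of the bushing: the Merge only ever sees the Filter, so swapping or re-scanning what sits behind it does not disturb the join list. `usable_joins` reflects the caveat from the text, since a join stops being usable once one of its columns disappears from the source.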
-Ryan Moran 12/14/10