byronwall/bUTL

Join columns only works for a single row of data

Closed this issue · 5 comments

It would be good if this feature was able to iterate through a block of rows and process each row individually. This is the same behavior as the split feature which will go through an entire row.

Is this in reference to CombineCells()?

Yes, I should have been more explicit. There is probably a better name for that Sub as well. I think something like below will work. The idea is that you iterate through Rows and do the combination for each row on its own. For this one as it sits, it does not maintain any discontinuity in the input range. Not sure how best to handle that going forward. There is also the question of how to handle blanks. When this Sub was created, it was meant to be the reverse of the split into columns option so blanks were meaningful and needed to be delimited. I think there was a specific reason I wanted to do that back then.

Sub CombineCells()

    Dim rngInput As Range
    On Error GoTo errHandler
    Set rngInput = Application.InputBox("Select the range of cells to combine:", Type:=8)

    Dim strDelim As String
    strDelim = Application.InputBox("Delimeter:")
    If strDelim = "" Then GoTo errHandler
    If strDelim = "False" Then GoTo errHandler
    Dim rngOutput As Range
    Set rngOutput = Application.InputBox("Select the output range:", Type:=8)

    If rngOutput Is Nothing Then
        GoTo errHandler
    End If

    'only keep the upper right corner of output
    Set rngOutput = rngOutput.Cells(1, 1)

    Dim rngRow As Range
    For Each rngRow In rngInput.rows
        Dim arr_values As Variant
        arr_values = Application.Transpose(Application.Transpose(rngRow.Value))

        rngOutput = Join(arr_values, strDelim)
        'next cell will output one row lower
        Set rngOutput = rngOutput.Offset(1)
    Next rngRow


    Exit Sub
errHandler:
    MsgBox ("No Range or Delimiter Selected!")
End Sub

The problem with Join is that it requires a one dimensional array. If you still want to use the array for speed, I'm sure it could be adopted. I went at it like this -

(ha! I finally figured out markdown here)

Sub CombineCells()
    'collect all user data up front
    Dim rngInput As Range
    On Error GoTo errHandler
    Set rngInput = Application.InputBox("Select the range of cells to combine:", Type:=8)

    Dim strDelim As String
    strDelim = Application.InputBox("Delimeter:")
    If strDelim = "" Then GoTo errHandler
    If strDelim = "False" Then GoTo errHandler
    Dim rngOutput As Range
    Set rngOutput = Application.InputBox("Select the output range:", Type:=8)

    'Check the size of input and adjust output
    Dim y As Long
    y = rngInput.Columns.Count

    Dim x As Long
    x = rngInput.Rows.Count

    rngOutput = rngOutput.Resize(x, 1)

    'Read input rows into a single string
    Dim strOutput As String
    For i = 1 To x
        strOutput = vbNullString
        For j = 1 To y
            strOutput = strOutput & strDelim & rngInput(i, j)
        Next
        'Get rid of the first character (strDelim)
        strOutput = Right(strOutput, Len(strOutput) - 1)
        'Print it!
        rngOutput(i, 1) = strOutput
    Next
    Exit Sub
errHandler:
    MsgBox ("No Range or Delimiter Selected!")
End Sub

I am cheating with the Join call using the double Transpose to force the 2D array (.Value) back down to 1D (arr_values) and usingJoinon that 1D array. Since I am iterating by rows, the underlying data is 1D but.Value` returns the 2D array by default. That code I posted was tested and it worked.

If we want to be fast with an array for output, the code above could use Join for each row the way it is above but output the result into an array which is then dumped back into the sheet.

As a general point, I know that it's good form to work with arrays when outputting data back to a multi-cell Range. I've always been lazy about using them because I normally don't notice the speed impact with the size of ranges I deal with. I also have generally erred on the side of allowing discontinuous ranges as input/output (if you can't tell 😄) and array outputs tend to break that (or at least overwrite cells in middle that should have not been touched).

I'd support using the array for output for an example like this one because I don't think the discontinuous range is of much concern and it's possible this is done to a large range of data.

This was also addressed with PR #22. I went with your version for now. It works when I tested it, and I am not too worried about implementation. We probably want to come back to this one and have it output the array instead of going cell by cell.

I will run through the code and find other places where array output would be more appropriate. From there, I will create new issues looking into that.