python – Any reason not to use + to concatenate two strings?

There is nothing wrong with concatenating two strings with `+`. Indeed, it's easier to read than `''.join([a, b])`.

You are right, though, that concatenating more than two strings with `+` is an O(n^2) operation (compared to O(n) for `join`) and thus becomes inefficient. However, this has nothing to do with using a loop. Even `a + b + c + ...` is O(n^2), because each concatenation produces a new string and copies everything accumulated so far.

CPython 2.4 and above try to mitigate that, but it's still advisable to use `join` when concatenating more than two strings.
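To make the quadratic cost concrete, here is a small sketch that counts copied characters under a simplified model: it assumes every `result = result + part` writes out the whole new string and ignores the CPython mitigation mentioned above. The function names are illustrative and not part of any library.

```python
def chars_copied_with_plus(parts):
    """Characters written if each `result = result + part` builds a fresh string."""
    copied = 0
    length = 0  # length of the accumulated result so far
    for part in parts:
        length += len(part)
        copied += length  # assumption: the whole new result is written again
    return copied


def chars_copied_with_join(parts):
    """Characters written by a single join: each character is copied once."""
    return sum(len(part) for part in parts)


parts = ["x"] * 1000
print(chars_copied_with_plus(parts))  # 500500
print(chars_copied_with_join(parts))  # 1000
```

For 1000 one-character strings the model gives 1 + 2 + ... + 1000 = 500500 copied characters for `+`, against 1000 for `join`.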

The plus operator is a perfectly fine solution for concatenating two Python strings. But if you keep adding more than two strings (n > 25), you might want to consider something else.

The `''.join([a, b, c])` trick is a performance optimization.
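For more than a handful of pieces, the usual pattern is to collect them in a list and join once at the end. A minimal sketch of both approaches, with hypothetical function names, showing that they produce the same string:

```python
def build_with_plus(pieces):
    """Accumulate with +=; may create a new string on every step."""
    result = ""
    for piece in pieces:
        result += piece
    return result


def build_with_join(pieces):
    """Join once; the result is allocated a single time."""
    return "".join(pieces)


pieces = [str(i) for i in range(5)]
print(build_with_plus(pieces))  # 01234
print(build_with_join(pieces))  # 01234
```

Both functions return the same value, so switching between them is purely a question of readability and speed.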

The assumption that one should never, ever use `+` for string concatenation, but instead always use `''.join`, may be a myth. It is true that using `+` creates unnecessary temporary copies of immutable string objects, but the other, less often quoted fact is that calling `join` in a loop generally adds the overhead of a function call. Let's take your example.

Create two lists: one from the linked SO question and another, bigger, fabricated one.

``````
>>> import random
>>> myl1 = ['A', 'B', 'C', 'D', 'E', 'F']
>>> myl2 = [chr(random.randint(65, 90)) for i in range(0, 10000)]
``````

Let's create two functions, `UsePlus` and `UseJoin`, which use `+` and `join` respectively.

``````
>>> def UsePlus():
    return [myl[i] + myl[i + 1] for i in range(0, len(myl), 2)]

>>> def UseJoin():
    return [''.join((myl[i], myl[i + 1])) for i in range(0, len(myl), 2)]
``````

Let's run `timeit` with the first list.

``````
>>> import timeit
>>> myl = myl1
>>> t1 = timeit.Timer('UsePlus()', 'from __main__ import UsePlus')
>>> t2 = timeit.Timer('UseJoin()', 'from __main__ import UseJoin')
>>> print('%.2f usec/pass' % (1000000 * t1.timeit(number=100000) / 100000))
2.48 usec/pass
>>> print('%.2f usec/pass' % (1000000 * t2.timeit(number=100000) / 100000))
2.61 usec/pass
``````

They have almost the same runtime.

Let's use `cProfile` with the bigger list.

``````
>>> import cProfile
>>> myl = myl2
>>> cProfile.run('UsePlus()')
         5 function calls in 0.001 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 <pyshell#1376>:1(UsePlus)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}

>>> cProfile.run('UseJoin()')
         5005 function calls in 0.029 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.015    0.015    0.029    0.029 <pyshell#1388>:1(UseJoin)
        1    0.000    0.000    0.029    0.029 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     5000    0.014    0.000    0.014    0.000 {method 'join' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {range}
``````

So it looks like using `join` results in extra function calls (5000 of them here, one per pair), which add to the overhead.
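The session above comes from an older interpreter, so the numbers may not carry over. A rough sketch for repeating the comparison on a current Python 3 follows; the function names, the fixed seed, and the pass count are assumptions made for illustration, and the printed timings will vary by machine and version.

```python
import random
import timeit


def use_plus(items):
    """Concatenate adjacent pairs with the + operator."""
    return [items[i] + items[i + 1] for i in range(0, len(items), 2)]


def use_join(items):
    """Concatenate adjacent pairs with str.join (one method call per pair)."""
    return ["".join((items[i], items[i + 1])) for i in range(0, len(items), 2)]


random.seed(0)  # fixed seed so repeated runs use the same input
letters = [chr(random.randint(65, 90)) for _ in range(10000)]

# Both approaches must produce identical results before timing them.
assert use_plus(letters) == use_join(letters)

passes = 100
for func in (use_plus, use_join):
    seconds = timeit.timeit(lambda: func(letters), number=passes)
    print("%s: %.2f usec/pass" % (func.__name__, seconds / passes * 1e6))
```

Passing the list as an argument, rather than reading a global as the original functions do, keeps the two functions independent of interpreter state.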

Now, coming back to the question: should one discourage the use of `+` in favor of `join` in all cases?

I believe not. Two things should be taken into consideration:

1. The length of the strings in question.
2. The number of concatenation operations.

And of course, in development, premature optimization is evil.