[Sugar-devel] Analysing code changes, git diff

Tomeu Vizoso tomeu at sugarlabs.org
Thu Sep 9 09:01:01 EDT 2010


On Thu, Sep 9, 2010 at 14:08, Sascha Silbe
<sascha-ml-reply-to-2010-3 at silbe.org> wrote:
> Excerpts from James Cameron's message of Thu Sep 09 03:24:36 +0200 2010:
>
>> Sorry, I used tabify in emacs, so as to force the diff to show the
>> entire function as changed.  I had said I "replaced tabs with spaces",
>> but I should have said "replaced spaces with tabs".  This is only
>> because I don't know how to drive git diff well enough.
>
> This brings up something I have been brooding about for some time now:
>
> The default output of "git diff" often isn't very useful for
> understanding a patch or even just merging changes. It's a low-level
> representation and, when comparing Python code, often not even a good
> one.
>
> I know git can be configured to use external "diff drivers" and "merge
> drivers" (see also [6]), so maybe we can improve this. That's why I
> started a new thread: Others might be aware of tools that can be used,
> either as-is or after some integration work (or at least on their own).

Don't know about better diffs, but I like meld as driven by git mergetool.

Regards,

Tomeu

> There are two different things that bug me about "git diff" when applied
> to the Sugar source code:
>
>
> 1. The built-in algorithm is generic (good) and always language-
>   agnostic (bad). For Python code, this often results in:
>
>   I. Choosing unsuitable borders.
>      Example:
>
> @@ -81,15 +82,19 @@ def _async_copy(self, file_path, destination_path, completion_cb,
>                 unlink_src)
>         async_copy.start()
>
> -    def retrieve(self, uid, user_id, extension):
> +    def has_data(self, object_id):
> +        file_path = layoutmanager.get_instance().get_data_path(object_id)
> +        return os.path.exists(file_path)
> +
> +    def retrieve(self, object_id, user_id, extension):
>         """Place the file associated to a given entry into a directory
>            where the user can read it. The caller is reponsible for
>            deleting this file.
>
>         """
>
>      This mixes up the changes to retrieve (only signature changed) and
>      adding a new method has_data(). When handling conflicts, the above
>      output is unnecessarily hard to understand and resolve.
>      What I would have liked to see instead is:
>
> +    def has_data(self, object_id):
> +        file_path = layoutmanager.get_instance().get_data_path(object_id)
> +        return os.path.exists(file_path)
> +
> -    def retrieve(self, uid, user_id, extension):
> +    def retrieve(self, object_id, user_id, extension):
>         """Place the file associated to a given entry into a directory
>            where the user can read it. The caller is reponsible for
>            deleting this file.
>
>         """
>
>
>   II. Less useful "hunk headers".
>       Default output from git diff:
>
> @@ -54,7 +251,8 @@ class DataStore(dbus.service.Object):
>                                         bus=dbus.SessionBus(),
>                                         replace_existing=False,
>                                         allow_replacement=False)
> -        dbus.service.Object.__init__(self, bus_name, DS_OBJECT_PATH)
> +        dbus.service.Object.__init__(self, bus_name, DS_OBJECT_PATH2)
> +        self._api_v1 = DBusAPIv1(self)
>
>         migrated = self._migrate()
>
>      I can see that the change happens in class DataStore, but not
>      in which function (line numbers and even context can vary
>      massively in my version support branch).
>      If I set diff=python in .gitattributes, I get:
>
> @@ -54,7 +251,8 @@ def __init__(self, **options):
>                                         bus=dbus.SessionBus(),
>                                         replace_existing=False,
>                                         allow_replacement=False)
> -        dbus.service.Object.__init__(self, bus_name, DS_OBJECT_PATH)
> +        dbus.service.Object.__init__(self, bus_name, DS_OBJECT_PATH2)
> +        self._api_v1 = DBusAPIv1(self)
>
>         migrated = self._migrate()
>
>      OK, so it's in method __init__(), but of which class?
>      Both combined would be great:
>
> @@ -54,7 +251,8 @@ DataStore.__init__(self, **options):
>
>
>
> 2. What I actually want to see most of the time is some kind of
>   "semantic diff". Something that shows me the refactorings that have
>   taken place, like "factor out the following block of code from foo
>   into _bar". I know there are tools to perform these kind of changes
>   [1-3] but I don't know of anything that can analyse some source code
>   and output the differences in terms of refactorings (and additional
>   code changes). There's at least one tool that seems to do something
>   along this line (though its output isn't exactly easy to read [4])
>   but it's proprietary, Windows-only software [5].
>   SSDDiff [7] looks fine and is open source, but only operates on XML.
>
>   Research on this topic has been going on for at least two decades [8],
>   but what I'm looking for is actual, usable tools, not papers.
>   Something like SSDDiff operating on ASTs instead of XML would be
>   awesome.
>
>
> Thanks for taking the time to read my entire post. Looking forward to
> any suggestion on what tools we can use to better analyse code changes.
>
> Sascha (wishing for wiki formatting in EMails)
>
> [1] http://bicyclerepair.sourceforge.net/
> [2] http://www.eclipse.org/
> [3] http://www.eclipse.org/articles/article.php?file=Article-Unleashing-the-Power-of-Refactoring/index.html
> [4] http://www.semdesigns.com/Products/SmartDifferencer/CSmartDifferencerExample.html#CSharpSmartDifferencerOutput
> [5] http://www.semdesigns.com/Products/SmartDifferencer/CSmartDifferencer.html
> [6] http://idnotfound.wordpress.com/2009/05/09/word-by-word-diffs-in-git/
> [7] http://ssddiff.alioth.debian.org/
> [8] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.3377
> --
> http://sascha.silbe.org/
> http://www.infra-silbe.de/
>
> _______________________________________________
> Sugar-devel mailing list
> Sugar-devel at lists.sugarlabs.org
> http://lists.sugarlabs.org/listinfo/sugar-devel
>
>


More information about the Sugar-devel mailing list