[elbe-devel] [PATCH 07/25] py3: care on encoding of license files

Manuel Traut manuel.traut at linutronix.de
Fri Dec 8 18:45:46 CET 2017


On Wed, Dec 06, 2017 at 11:08:44AM +0100, Torben Hohn wrote:
> On Fri, Dec 01, 2017 at 04:51:04PM +0100, Manuel Traut wrote:
> > This tries to find the correct encoding of a copyright file and write the
> > decoded content of the copyright file to licence.xml.
> > 
> > Licence texts that can't be decoded are dropped and an error message is
> > logged.
> 
> Crap
> 
> > 
> > Signed-off-by: Manuel Traut <manut at linutronix.de>
> > ---
> >  elbepack/efilesystem.py | 30 ++++++++++++++++++++----------
> >  elbepack/licencexml.py  | 18 ++++++++++--------
> >  2 files changed, 30 insertions(+), 18 deletions(-)
> > 
> > diff --git a/elbepack/efilesystem.py b/elbepack/efilesystem.py
> > index be7d7e25..bd9f7453 100644
> > --- a/elbepack/efilesystem.py
> > +++ b/elbepack/efilesystem.py
> > @@ -142,18 +142,28 @@ class ElbeFilesystem(Filesystem):
> >                          (os.path.join(dir, "copyright"), e.strerror))
> >                  lic_text = "Error while processing license file %s: '%s'" % (os.path.join(dir, "copyright"), e.strerror)
> >  
> > -            try:
> > -                lic_text = unicode (lic_text, encoding='utf-8')
> > -            except:
> > -                lic_text = unicode (lic_text, encoding='iso-8859-1')
> > -
> > -
> >              if not f is None:
> >                  f.write(unicode(os.path.basename(dir)))
> > -                f.write(u":\n================================================================================")
> > -                f.write(u"\n")
> > -                f.write(lic_text)
> > -                f.write(u"\n\n")
> > +                f.write(unicode(":\n================================================================================"))
> > +                f.write(unicode("\n"))
> 
> what is this ?
> 2to3 converting u"" to unicode"" ?
> to my understanding, py3 strings are always unicode, if they are not b''
> (bytes).

But the code needs to be written so that it runs on both py2 and py3 at the
moment. We can remove that once we have switched to py3 completely.
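
One way to keep the literals readable on both interpreters, sketched here
under assumptions: `from __future__ import unicode_literals` makes plain
"..." literals text on py2, and `io.open()` gives a text-mode file with an
explicit encoding on both versions. The helper name `write_header`, the
`demo` function and the package name "libfoo" are hypothetical and not part
of elbepack.

```python
from __future__ import unicode_literals  # plain "..." literals are text on py2 too

import io
import os
import tempfile


def write_header(f, pkg_name):
    """Write a package heading; f must be a text-mode file from io.open()."""
    f.write(pkg_name)
    f.write(":\n" + "=" * 80 + "\n")


def demo():
    # Hypothetical package name and temporary file, for illustration only.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with io.open(path, "w", encoding="utf-8") as f:
            write_header(f, "libfoo")
        with io.open(path, "r", encoding="utf-8") as f:
            return f.read()
    finally:
        os.remove(path)
```

With this, neither u"" prefixes nor unicode() wrappers are needed around
literals, as long as every value passed to f.write() is already text.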

> > +                try:
> > +                    f.write(lic_text.encode('utf-8'))
> 
> This is what bpython 3 says:
> >>> f = open("/tmp/xxxtest","w")
> >>> f
> <_io.TextIOWrapper name='/tmp/xxxtest' mode='w' encoding='UTF-8'>
> >>> f.write ("abc".encode ("utf-8"))
> Traceback (most recent call last):
>   File "<input>", line 1, in <module>
> TypeError: must be str, not bytes

ok, needs rework.
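
A possible direction for the rework, as a sketch only: decode the raw bytes
once, then write text to a text-mode file instead of calling .encode() before
f.write(). The UTF-8 then ISO-8859-1 order mirrors the fallback the removed
lines used; the function names `decode_licence` and `write_licence` are
hypothetical.

```python
import io


def decode_licence(raw):
    """Return text for raw licence bytes: try UTF-8, fall back to ISO-8859-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail,
        # but the result may be wrong if the file used another encoding.
        return raw.decode("iso-8859-1")


def write_licence(f, raw):
    """Write decoded licence text to f, which must be a text-mode file."""
    f.write(decode_licence(raw))
    f.write(u"\n\n")
```

This works with io.StringIO or a file from io.open(..., encoding='utf-8') on
both py2 and py3, because only text ever reaches f.write().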

> > +                except TypeError as e:
> > +                    log.printo(e)
> > +                    log.printo("error by writing licence of %s" % (os.path.join(dir, "copyright")))
> > +                    log.printo(str(lic_text))
> > +                    f.write(unicode(str(lic_text)))
> > +                except UnicodeDecodeError as e:
> > +                    log.printo(e)
> > +                    log.printo("error by writing licence of %s" % (os.path.join(dir, "copyright")))
> > +                    log.printo(str(lic_text))
> > +                    n = f.name
> > +                    f.close()
> > +                    with open (n, 'a') as f:
> > +                        f.write(lic_text)
> > +                    f = open(n, 'ab')
> 
> If we have some technical problem with ONE license, we just emit an Error ?
> This is not the right thing todo.
> Sorry. Either ALL licenses, or no licenses. Talk with tglx about this.

That's true. I don't need to talk to anybody about that; just thinking is enough.

But the current code raises an exception that interrupts the whole build.
This means that if you use a package that triggers an encoding exception, you
are not able to build the image. That's a no-go.

I'm quite unhappy with the licence code. It's hard to understand and
undocumented, also the encoding issue needs to be resolved.

I'll post a patch with a big try/except around the licence code in my v2 series
and drop this horrible patch.
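
Roughly what such a guard could look like, as a sketch under assumptions: the
whole licence export runs inside one try/except, so a failure is logged and
reported to the caller instead of aborting the build. `run_licence_export`
and the `export` callable are hypothetical names, and the standard logging
module stands in for ELBE's own log object.

```python
import logging

log = logging.getLogger("licence-sketch")


def run_licence_export(export, *args, **kwargs):
    """Call export(); return True on success, False if it raised."""
    try:
        export(*args, **kwargs)
    except Exception:
        # Log the full traceback, then let the build continue.
        log.exception("licence export failed; continuing without licence output")
        return False
    return True
```

The caller can use the return value to discard partial output, so the result
is either all licences or none.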

If we do the rewrite, we can cover the py3 conversion with that [0].

[0] https://github.com/Linutronix/elbe/issues/133

> > +
> > +                f.write(unicode("\n\n"))
> >  
> >              if not xml_fname is None:
> >                  licence_xml.add_copyright_file (os.path.basename(dir), lic_text)
> > diff --git a/elbepack/licencexml.py b/elbepack/licencexml.py
> > index a40dffcd..d2c4979a 100644
> > --- a/elbepack/licencexml.py
> > +++ b/elbepack/licencexml.py
> > @@ -66,9 +66,17 @@ class copyright_xml (object):
> >          xmlpkg = self.pkglist.append('pkglicense')
> >          xmlpkg.et.attrib['name'] = pkg_name
> >          txtnode = xmlpkg.append ('text')
> > -        txtnode.et.text = copyright
> >  
> > -        bytesio = io.StringIO (unicode(txtnode.et.text))
> > +        # just return if we cant decode the copyright file; we also return
> > +        # if we can't interpret it, so this should be ok
> > +        try:
> > +            txtnode.et.text = unicode(copyright)
> > +            bytesio = io.StringIO (txtnode.et.text)
> > +        except TypeError as e:
> > +            return
> > +        except UnicodeDecodeError as e:
> > +            return
> > +
> 
> This also looks broken.
> 
> >          try:
> >              c = Copyright (bytesio)
> >              files = []
> > @@ -125,9 +133,3 @@ class copyright_xml (object):
> >  
> >      def write(self, fname):
> >          self.outxml.write (fname, encoding="iso-8859-1")
> > -
> > -        
> > -
> > -
> > -
> > -
> > -- 
> > 2.15.1
> > 
> > 




